Biostatistics and Biometrics Open Access Journal

Biplots in Covariance Analysis

**Opeoluwa FO¹* and Sugnet L²**

¹Department of Statistics and Population Studies, University of Namibia, South Africa

²Department of Statistics and Actuarial Science, Stellenbosch University, South Africa

Submission: September 12, 2017; Published: November 10, 2017

*Corresponding author: Opeoluwa F Oyedele, Department of Statistics and Population Studies, University of Namibia, South Africa, Fax: +264612063791, Tel: +264612064515; Email: OpeoluwaOyedele@gmail.com

How to cite this article: Opeoluwa F, Sugnet L. Biplots in Covariance Analysis. Biostat Biometrics Open Acc J. 2017;3(4): 555623. DOI: 10.19080/BBOAJ.2017.03.555623

Abstract

Among the various statistical techniques useful for exploring the relationships between different sets of variables is covariance analysis. Since biplots in general are useful graphical tools for exploring the relationships between (multivariate) variables, the biplot is employed in the covariance analysis framework to form the covariance biplot. The resulting biplot provides a single graphical display of the variables and the inter-variable relationships. An illustration is shown using mineral sorting production data consisting of five hundred and seventy-two processes.

Keywords: Biplots; Covariance matrix; Monoplots; Variance-covariance matrix

Abbreviations: PCA: Principal Component Analysis; CA: Correspondence Analysis; MCA: Multiple Correspondence Analysis; CVA: Canonical Variate Analysis; MDS: Multi-Dimensional Scaling; DA: Discriminant Analysis; GLMs: Generalized Linear Models; PLS: Partial Least Squares

Introduction

In a situation where the relationships between different sets of variables are of interest, various statistical techniques can be useful tools for analysis. Among them is the covariance matrix. Consider two centered matrices $X_0: N \times P$ and $Y_0: N \times M$. The covariance between $X_0$ and $Y_0$ is defined by

$$\operatorname{cov}(X_0, Y_0) = \frac{1}{N-1} X_0^T Y_0. \qquad (1)$$

However, when only one set of variables is under consideration, the variance-covariance matrix is defined by

$$\operatorname{cov}(X_0, X_0) = \frac{1}{N-1} X_0^T X_0. \qquad (2)$$

This is also written as $\operatorname{cov}(X_0, X_0) = \operatorname{var}(X_0) = \operatorname{cov}(X_0)$. Here, the variances of $X_0$ are given on the diagonal of (2), while the covariances are shown off the diagonal. The relationships between different sets of variables can be explored using some form of graphical display such as biplots [1,2]. Since biplots are useful graphical tools for exploring the relationships between variables, the biplot is employed in the form of the covariance biplot. In this paper, the general idea behind the covariance biplot is discussed. The paper further demonstrates, with graphical illustrations, how the covariance biplot can help to reveal variable and inter-variable relationships. This paper extends the work of Oyedele & Gardner-Lubbe [3].
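The two definitions can be checked numerically. The following is a minimal sketch in R on simulated data (the data, dimensions and object names are assumptions made for illustration only), taking the usual sample normalization 1/(N-1), which is also what R's built-in `cov()` uses.

```r
# Simulated data (an assumption for illustration): N = 50 samples,
# P = 3 X-variables and M = 2 Y-variables.
set.seed(1)
N <- 50
X <- matrix(rnorm(N * 3), nrow = N, ncol = 3)
Y <- matrix(rnorm(N * 2), nrow = N, ncol = 2)

# Column-centre both matrices to obtain X0 and Y0.
X0 <- scale(X, center = TRUE, scale = FALSE)
Y0 <- scale(Y, center = TRUE, scale = FALSE)

# Covariance between the two sets (P x M) and the
# variance-covariance matrix of the X-variables (P x P).
S_xy <- crossprod(X0, Y0) / (N - 1)
S_xx <- crossprod(X0) / (N - 1)

# Both agree with cov(); the diagonal of S_xx holds the variances.
all.equal(S_xy, cov(X, Y), check.attributes = FALSE)
all.equal(diag(S_xx), apply(X, 2, var))
```

Using `crossprod()` rather than `t(X0) %*% Y0` is only a convenience; both compute the same matrix product.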

The remainder of this paper is organized as follows. Section 2 provides a brief overview of the biplot and its fundamental idea, before its employment in the form of the covariance biplot is discussed in Section 3. This is followed in Section 4 by an application to mineral sorting production data that show the quality evaluation of five hundred and seventy-two processes used to produce a final product. Finally, some concluding remarks are offered in Section 5.

Biplots

Biplots, or more precisely asymmetric biplots, are often referred to as the multivariate version of scatter plots. In the usual two-dimensional scatter plot, two orthogonal Cartesian axes are used for reading off the values of the sample points, as well as for adding points to the plot. The fact that biplots are referred to as multivariate scatter plots implies that more than two variables are represented by (non-orthogonal) axes [1,4]. Just like scatter plots, biplots are helpful for revealing clustering, multivariate outliers, and variable and inter-variable relationships in a data set [2]. An advantage of the biplot is that it allows for the visual assessment of a high-dimensional data matrix in a two- or three-dimensional plot.

Since the biplot was first introduced by Gabriel [5], its theory has been significantly extended with Gower & Hand's [1] monograph, Yan & Kang's [6] description of various methods to visualize and interpret a biplot, Greenacre’s [7] text on the use of biplots in practice and Gower, Lubbe and Le Roux's [8] illustration of the construction of various forms of biplots. In the first biplots introduced by Gabriel, the rows and columns of a data matrix were represented by vectors, but to differentiate between these two sets of vectors, Gabriel [5] suggested that the rows of the data matrix be represented by points. Gower & Hand [1] went a step further by introducing the idea of representing the columns of the data matrix by axes, rather than vectors, while still representing the rows of the data matrix by points. This was done to support their theory that biplots were the multivariate version of scatter plots. Gower & Hand's [1] biplot representation is very useful when the data matrix under consideration is a matrix of samples by variables.

An example of the biplot display of the chemical-sensory data by Mevik & Wehrens [9] is shown in Figure 1. These data show the sensory descriptors and the chemical quality measurements of sixteen olive oils. In this biplot display, a representation of the variance of each variable is provided by the thicker arrow (vector) on each axis. For example, the standard deviation of Acidity is small compared to the others, while DK has a large standard deviation. This is evident from the length of the vectors on these axes. Furthermore, several relationships can be deduced from this biplot, such as a relation between Syrup, K232 and Peroxide. Other relationships that can be deduced are those between K270, Transp and Glossy, and between Green, Yellow and DK. These deductions are based on the angles between the axes.

Since 1971, biplots have been employed in a number of multivariate methods as a form of graphical representation of data, pattern and data inspection, as well as for displaying results found by well-known statistical methods of analysis [10-12]. The most well-known methods are Principal Component Analysis (PCA), Correspondence Analysis (CA), Multiple Correspondence Analysis (MCA), Canonical Variate Analysis (CVA), Multi-Dimensional Scaling (MDS), Discriminant Analysis (DA), Generalized Linear Models (GLMs), and recently, Partial Least Squares (PLS) by Oyedele & Lubbe [13]. All these forms of biplots have been applied to diverse fields of specialization, according to different needs and requirements.

Fundamental idea of biplots

A biplot is a joint graphical display of the rows and columns of a data matrix $D$ (of $G$ rows and $H$ columns) by means of markers $a_1, a_2, \ldots, a_G$ for its rows and markers $b_1, b_2, \ldots, b_H$ for its columns. Each marker is chosen in such a way that the inner product $a_i^T b_j$ represents $d_{ij}$, the $(i,j)$th element of the data matrix $D$ [14]. Biplots are often constructed in two dimensions. This does not mean that they are limited to two dimensions, but this is the most convenient biplot display. However, with $D$ being a (very) large matrix, the rank of $D$ is almost always higher than two. As a result, $D$ is approximated by a matrix of lower rank. Methods such as PCA can be used to perform this approximation. In PCA, the approximation is based on the method of least squares. To be precise, the sum of squares of the differences between $D$ and its approximation $\hat{D}$ is minimized. That is, minimize

$$\operatorname{trace}\{(D - \hat{D})(D - \hat{D})^T\}.$$

Taking $\hat{D}$ as the rank-two approximation of $D$, the biplot of a data matrix $D$ relies on the decomposition of $\hat{D}$ into the product of two matrices,

$$\hat{D} = AB^T, \qquad (3)$$

its row markers matrix ($A$) and its column markers matrix ($B$). Matrices $A$ and $B$ are defined as

$$A = [a_1 \; a_2 \; \cdots \; a_G]^T : G \times 2 \quad \text{and} \quad B = [b_1 \; b_2 \; \cdots \; b_H]^T : H \times 2.$$

Thus, the approximated rows and columns of a data matrix are represented in biplots. Generally, the number of columns in $A$ and $B$ is determined by the rank $r$ of the approximation of $D$. In practice, $r = 2$ is usually preferred for a convenient biplot display.
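As an illustration of this idea, the sketch below builds row and column markers from the SVD of a small simulated, column-centred matrix. The split of the singular values (all of them assigned to the row markers) is one common PCA-style choice and is an assumption here, as are the data and object names.

```r
# Simulated data matrix with G = 20 rows and H = 5 columns (illustrative only).
set.seed(2)
D <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)
D <- scale(D, center = TRUE, scale = FALSE)    # column-centre, as in PCA

s <- svd(D)                                    # D = U diag(d) V^T
r <- 2                                         # dimension of the display
A <- s$u[, 1:r] %*% diag(s$d[1:r], nrow = r)   # row markers, G x r
B <- s$v[, 1:r]                                # column markers, H x r
D_hat <- A %*% t(B)                            # rank-r least-squares approximation

# The inner product of row marker i and column marker j approximates d_ij.
i <- 3; j <- 4
c(approximated = sum(A[i, ] * B[j, ]), observed = D[i, j])

# The minimized criterion trace{(D - D_hat)(D - D_hat)^T} equals the
# sum of the squared singular values that were discarded.
loss <- sum((D - D_hat)^2)
all.equal(loss, sum(s$d[-(1:r)]^2))
```

Any other factorization of the same rank-two matrix into a product of a row-marker and a column-marker matrix gives the same inner products, so the choice mainly affects which set of markers has the nicer geometric interpretation.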

Biplot points and axes

In asymmetric biplots, the rows of a data matrix are represented by points, while the columns are represented by vectors or axes. Traditionally, columns are represented by vectors, but Gower & Hand [1] introduced axes to make the biplot similar to a scatter plot. This was done by extending the vectors, which represent the columns, through the biplot space to become axes. Thus, the biplot points will be defined by the row markers of the data matrix, whereas the biplot axes will be defined by the column markers. More precisely, for the biplot of a data matrix D, from equation (3), G rows of A will serve as the biplot points, while H rows of B will be used in calculating the directions of the biplot axes. An example of the biplot display can be seen in Figure 2.

Biplot implementation into covariance analysis framework

In general, there are two kinds of features displayed in biplots. These features can be specified as two sets of variables, or as a set of variables and samples, as in the case of the PCA biplot. Depending on the data matrix and the choice of features to be analyzed, however, biplots can also be constructed to display only one kind of feature. Gower, Lubbe and Le Roux [8] termed such biplots monoplots. In a monoplot, the kind of feature to be represented may be the samples only or one set of variables. Including an additional feature in the monoplot, say, another set of variables, would result in a biplot. As only variables are represented in the covariance and variance-covariance matrices, see equations (1) and (2), both monoplots and biplots can be used as graphical tools to explore their relationships. More precisely, a monoplot would be suitable for representing a variance-covariance matrix (equation (2)) graphically, while a biplot would be more appropriate for a covariance matrix (equation (1)). An example of the resulting monoplot display can be seen in Figures 3 & 4, while an example of the resulting biplot display can be seen in Figure 2.

Covariance monoplot

Consider the X-variables only. By the Singular Value Decomposition (SVD), $X_0 = U\Delta V^T$, for $U: N \times P$, $\Delta: P \times P$ and $V: P \times P$. Defining the matrices

$$G_X = H_X = \frac{1}{\sqrt{N-1}} V\Delta,$$

it follows that

$$\operatorname{cov}(X_0) = \frac{1}{N-1} X_0^T X_0 = \frac{1}{N-1} V\Delta U^T U\Delta V^T = \frac{1}{N-1} V\Delta^2 V^T = G_X H_X^T. \qquad (4)$$

Since $G_X = H_X$ in (4), the rows of either $G_X$ or $H_X$ will be used in the monoplot.

Moreover, given that only one set of variables, in this case $X_0$, is under consideration, and as the focus is on revealing the relationships within these variables, only one set of axes is needed. From expression (4), the directions of these axes are calculated from the $P$ rows of either $G_X$ or $H_X$.
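A minimal numerical sketch of the monoplot markers follows, using simulated data. It assumes the 1/(N-1) normalization of the variance-covariance matrix, so that the marker matrix is the right singular vectors scaled by the singular values divided by the square root of N-1; the object names are illustrative and are not taken from the PLSbiplot1 package.

```r
# Simulated X matrix (illustrative only): N = 40 samples, P = 4 variables.
set.seed(3)
N <- 40; P <- 4
X <- matrix(rnorm(N * P), nrow = N, ncol = P)
X0 <- scale(X, center = TRUE, scale = FALSE)

s <- svd(X0)                               # X0 = U Delta V^T
G_x <- s$v %*% diag(s$d) / sqrt(N - 1)     # P x P marker matrix (H_x is identical)

# Inner products between the rows of G_x reproduce the variance-covariance matrix.
all.equal(G_x %*% t(G_x), cov(X), check.attributes = FALSE)

# The squared length of each full row equals the variance of that variable.
all.equal(rowSums(G_x^2), apply(X, 2, var))

# For a two-dimensional monoplot, the first two columns give the axis directions.
axes_2d <- G_x[, 1:2]
axes_2d
```

In the two-dimensional display the inner products only approximate the covariances, with the quality of the approximation depending on how dominant the first two singular values are.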

Covariance biplots

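Consider both $X_0$ and $Y_0$. The $P \times M$ covariance matrix between $X_0$ and $Y_0$ is defined in equation (1). Let $S_{XY} = \operatorname{cov}(X_0, Y_0)$. By the SVD, $S_{XY} = D\Lambda F^T$, for $D: P \times M$, $\Lambda: M \times M$ and $F: M \times M$. The matrix $S_{XY} = D\Lambda F^T$ can be written as

$$S_{XY} \cong \hat{S}_{XY} = D\Lambda J F^T = D\Lambda^{\beta} J \Lambda^{1-\beta} F^T = D\Lambda^{\beta} J_r J_r^T \Lambda^{1-\beta} F^T = GH^T$$

where

$$G = D\Lambda^{\beta} J_r \quad \text{and} \quad H = F\Lambda^{1-\beta} J_r \quad \text{for any value of } \beta \in [0, 1]. \qquad (5)$$

In expression (5), the matrix $J$ has dimension $M \times M$ while the matrix $J_r$ has dimension $M \times r$, with $J_r$ consisting of the first $r$ columns of the $M \times M$ identity matrix so that $J = J_r J_r^T$. The matrix $G: P \times r$ contains the information about the X-variables, while the matrix $H: M \times r$ contains the information about the Y-variables.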

Since $S_{XY} \cong GH^T$, the inner product between the rows of the matrix $G$ and the rows of the matrix $H$ approximates the covariances between the X-variables and the Y-variables. Here, the rows of $G$ are associated with the X-variables, while the rows of $H$ are associated with the Y-variables. Since the focus is on revealing the relationships between two sets of variables, X and Y, only axes will be present in the resulting biplot. However, two sets of axes are needed, a set for the X-variables and a set for the Y-variables. The directions of the axes representing the X-variables are calculated using the $P$ rows of $G$, while the $M$ rows of $H$ are used to calculate the directions of the axes representing the Y-variables. This biplot, called the covariance biplot, reveals the relationships between the two sets of variables as well as within each set.
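The construction of the two marker matrices can be sketched in a few lines of R. The function name `cov_biplot_markers()` and its arguments are hypothetical (they are not part of the PLSbiplot1 package), and the data are simulated; selecting the first r singular vectors plays the role of multiplying by the selection matrix in expression (5).

```r
# Hypothetical helper: markers of a covariance biplot for a given beta and rank r.
cov_biplot_markers <- function(X, Y, beta = 0.5, r = 2) {
  S_xy <- cov(X, Y)                    # P x M covariance matrix
  s <- svd(S_xy)                       # S_xy = D Lambda F^T
  keep <- seq_len(r)                   # first r components
  G <- s$u[, keep, drop = FALSE] %*% diag(s$d[keep]^beta, nrow = r)
  H <- s$v[, keep, drop = FALSE] %*% diag(s$d[keep]^(1 - beta), nrow = r)
  list(G = G, H = H, S_xy = S_xy)
}

# Simulated data (illustrative only): P = 5 X-variables, M = 3 Y-variables.
set.seed(4)
X <- matrix(rnorm(60 * 5), nrow = 60, ncol = 5)
Y <- X[, 1:3] + matrix(rnorm(60 * 3), nrow = 60, ncol = 3)

m <- cov_biplot_markers(X, Y, beta = 0.5, r = 2)
dim(m$G)   # one row per X-variable
dim(m$H)   # one row per Y-variable

# With r = M nothing is discarded, so G H^T reproduces S_xy exactly.
full <- cov_biplot_markers(X, Y, beta = 0.5, r = 3)
all.equal(full$G %*% t(full$H), full$S_xy, check.attributes = FALSE)

# The product G H^T does not depend on beta; only the individual markers do.
m1 <- cov_biplot_markers(X, Y, beta = 1, r = 2)
all.equal(m$G %*% t(m$H), m1$G %*% t(m1$H))
```

The last check illustrates why the choice of the exponent matters only for how the within-set relationships are displayed, not for the approximated covariances between the two sets.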

From expression (5), when $\beta = 1$,

$$G = D\Lambda^{1} J_r = D\Lambda J_r \quad \text{and} \quad H = F\Lambda^{1-1} J_r = FJ_r.$$

Also,

$$H^T H = (FJ_r)^T (FJ_r) = J_r^T F^T F J_r = J_r^T J_r = I_r$$

and

$$G^T G = (D\Lambda J_r)^T (D\Lambda J_r) = J_r^T \Lambda D^T D \Lambda J_r = J_r^T \Lambda^2 J_r \neq I_r,$$

where $F^T F = I_M$, $J_r^T J_r = I_r$ and $D^T D = I_M$. For this choice of $\beta$,

$$S_{XY} S_{XY}^T \cong GH^T HG^T = GG^T$$

but

$$S_{XY}^T S_{XY} \cong HG^T GH^T \neq HH^T.$$

Therefore, from $S_{XY} S_{XY}^T \cong GG^T$, the row markers $G$ approximate the covariance between the rows of $S_{XY}$. In other words, with the rows of $S_{XY}$ associated with the X-variables, the rows of $G$ approximate the covariance between these X-variables.

Conversely, when $\beta = 0$,

$$G = D\Lambda^{0} J_r = DJ_r \quad \text{and} \quad H = F\Lambda^{1-0} J_r = F\Lambda J_r.$$

Now,

$$G^T G = (DJ_r)^T (DJ_r) = J_r^T D^T D J_r = J_r^T J_r = I_r$$

for $D^T D = I_M$ and $J_r^T J_r = I_r$, but

$$H^T H = (F\Lambda J_r)^T (F\Lambda J_r) = J_r^T \Lambda F^T F \Lambda J_r = J_r^T \Lambda^2 J_r \neq I_r,$$

where $F^T F = I_M$. Thus,

$$S_{XY}^T S_{XY} \cong HG^T GH^T = HH^T$$

and

$$S_{XY} S_{XY}^T \cong GH^T HG^T \neq GG^T.$$

From $S_{XY}^T S_{XY} \cong HH^T$, the column markers $H$ approximate the covariance between the columns of $S_{XY}$. In other words, with the columns of $S_{XY}$ associated with the Y-variables, the rows of $H$ approximate the covariance between these Y-variables.

Moreover, any β value between 0 and 1 will optimally approximate neither the covariance between the X-variables nor the covariance between the Y-variables, but rather, it will give an indication of both. Since choosing β closer to 0 better approximates the covariance between the Y-variables, while choosing β closer to 1 better approximates the covariance between the X-variables, the symmetric choice of β = 0.5 will be used in the biplot.

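With the choice of $\beta = 0.5$, the covariance between the X-variables and the covariance between the Y-variables are both equally approximated, although not optimally for either. That is, for $G = D\Lambda^{0.5} J_r$ and $H = F\Lambda^{1-0.5} J_r = F\Lambda^{0.5} J_r$,

$$G^T G = J_r^T \Lambda^{0.5} D^T D \Lambda^{0.5} J_r = J_r^T \Lambda J_r \quad \text{and} \quad H^T H = J_r^T \Lambda^{0.5} F^T F \Lambda^{0.5} J_r = J_r^T \Lambda J_r,$$

where $F^T F = I_M$ and $D^T D = I_M$. In this situation, the rows of $H J_r^T \Lambda^{0.5}$ approximate (non-optimally) the covariance between the Y-variables, while the rows of $G J_r^T \Lambda^{0.5}$ approximate (non-optimally) the covariance between the X-variables. In line with equation (1), the approximated covariance can be written as

$$\widehat{\operatorname{cov}}(X_0, Y_0) = \hat{S}_{XY} = GH^T.$$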

Thus, in expression (5), β = 0 only caters for the covariance between the Y-variables optimally, while β = 1 only caters for the covariance between the X-variables optimally. On the other hand, β = 0.5 caters for both the X- and Y-variables equally, although not as optimally as when only one set is being catered for.
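The effect of the three choices of β on the marker matrices can be verified numerically. The sketch below uses simulated data and a small hypothetical helper `markers()`; it only checks the orthogonality properties derived in this section and is not the routine used to produce the figures.

```r
# Simulated data (illustrative only): P = 4 X-variables, M = 3 Y-variables.
set.seed(5)
X <- matrix(rnorm(80 * 4), nrow = 80, ncol = 4)
Y <- X[, 1:3] + matrix(rnorm(80 * 3), nrow = 80, ncol = 3)

s <- svd(cov(X, Y))            # S_xy = D Lambda F^T
r <- 2
k <- seq_len(r)

# Hypothetical helper returning G and H for a given beta.
markers <- function(beta) {
  list(G = s$u[, k] %*% diag(s$d[k]^beta, nrow = r),
       H = s$v[, k] %*% diag(s$d[k]^(1 - beta), nrow = r))
}
b1 <- markers(1)
b0 <- markers(0)
bh <- markers(0.5)

# beta = 1: H has orthonormal columns, while G^T G holds the squared singular values.
all.equal(crossprod(b1$H), diag(r))
all.equal(crossprod(b1$G), diag(s$d[k]^2, nrow = r))

# beta = 0: the roles are reversed.
all.equal(crossprod(b0$G), diag(r))
all.equal(crossprod(b0$H), diag(s$d[k]^2, nrow = r))

# beta = 0.5: both sets of markers are scaled in the same way.
all.equal(crossprod(bh$G), crossprod(bh$H))
```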

It should be noted that since only variables are being represented in the covariance monoplot/biplot and there are no samples to (orthogonally) project onto the axes representing these variables, calibration markers are not necessary on these axes.

An Illustration

The following illustration of the covariance biplot is performed using the SOVR data from Umetrics MKS [15]. These mineral sorting production data show the quality evaluation of five hundred and seventy-two processes used to produce a final product. Twelve process factors were used in the evaluation, namely, total load (TON_IN), load of grinder 30 (KR30_IN), load of grinder 40 (KR40_IN), concentration mull (PARM), velocity of separator 1 (HS_1), velocity of separator 2 (HS_2), effect of grinder 30 (PKR_30), effect of grinder 40 (PKR_40), ore waste (GBA), load of separator 3 (TON_S3), waste from grinding (KRAV_F) and total waste (TOTAVF). The aim of this evaluation was to investigate the relationships between the process factors and the quality of the final product. Six output variables, amount of concentration type 1 (PAR), amount of concentration type 2 (FAR), distribution of concentration type 1 and 2 (r_FAR), iron in FAR (Percent_Fe_FAR), phosphor in FAR (Percent_P_FAR) and iron in raw ore (Percent_Fe_malm), were used to measure the quality of the final product. The processes are assigned as the samples, while the process factors and output variables are the predictor and response variables respectively. Thus, the SOVR data can be viewed as a 572×18 data matrix, comprising an X:572×12 matrix and a Y:572×6 matrix. A copy of these data can be found at the Dropbox link [16] under the "Data Sets" folder.

As is customary for covariance analysis, both the X:572×12 and Y:572×6 matrices are centred before the analysis. In addition, each of these matrices is standardized by dividing each centred variable by its respective standard deviation, as this facilitates the direct comparison of correlation values. The (two-dimensional) covariance biplot for the data is shown in Figure 2, with β = 0.5. The respective G:12×2 and H:6×2 matrices are shown in Tables 1 & 2 respectively. The approximated covariance values are shown in Table 3. The representation of the variance of each variable, represented by the thicker arrow (vector) on each axis, is shown in the biplot displays (Figures 2 to 4). Observing the length of the thicker arrows (vectors) on the axes in Figure 2, output variables PAR and FAR can be said to have the largest standard deviation. However, variable Percent_Fe_malm has the smallest standard deviation, compared to the others, followed by process factor HS_1. These deductions can also be seen (clearly) in Figures 3 & 4.
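The remark that standardization facilitates the comparison of correlation values can be illustrated with a short R sketch: once each variable is centred and divided by its standard deviation, the covariance matrix of the standardized data coincides with the correlation matrix of the original data. The data below are simulated with deliberately different scales and are not the SOVR data.

```r
# Simulated variables on very different scales (illustrative only).
set.seed(7)
X <- cbind(a = rnorm(100, sd = 1),
           b = rnorm(100, sd = 10),
           c = rnorm(100, sd = 100))

# Centre each column and divide it by its standard deviation.
Xs <- scale(X, center = TRUE, scale = TRUE)

# Every standardized variable has unit standard deviation ...
apply(Xs, 2, sd)

# ... so covariances of the standardized data are correlations of the original data.
all.equal(cov(Xs), cor(X))
```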

Furthermore, the positions of the biplot axes give an indication of the correlations between the variables. Axes forming small angles are said to be strongly correlated, either positively or negatively. Axes are positively correlated when they lie in the same direction, while negatively correlated axes lie in opposite directions. Also, axes that are close to forming right angles are said to be uncorrelated. From Figure 2, various inter-variable relationships can be deduced, such as the relation between output variables FAR and r_FAR and process factors TOTAVF, GBA, PKR_40, KR30_IN, TON_IN, KR40_IN, KRAV_F and PKR_30. Looking at the directions of these axes in the biplot, this relation is a positive one. Also, the relation between process factor HS_2 and output variables Percent_P_FAR and Percent_Fe_FAR can be seen. The relation between factor HS_2 and Percent_P_FAR is a negative one, while the relation between factor HS_2 and Percent_Fe_FAR is a positive one. However, output variable Percent_Fe_malm can be said to have no relation with the others.
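The link between the angles of the axes and the correlations can be sketched as follows. The example uses simulated, standardized data with a known structure (two positively related variables, one negatively related variable and one unrelated variable) and a hypothetical helper `axis_cosine()`; it is an illustration under these assumptions, not a reproduction of Figure 2.

```r
# Simulated data (illustrative only): x2 follows x1, x3 opposes x1, x4 is unrelated.
set.seed(6)
N <- 200
z <- rnorm(N)
X <- cbind(x1 = z,
           x2 = z + rnorm(N, sd = 0.1),
           x3 = -z + rnorm(N, sd = 0.1),
           x4 = rnorm(N))
Xs <- scale(X)                                   # centred and standardized

# Two-dimensional monoplot markers: one row (axis direction) per variable.
s <- svd(Xs)
G_x <- s$v[, 1:2] %*% diag(s$d[1:2], nrow = 2) / sqrt(N - 1)
rownames(G_x) <- colnames(X)

# Hypothetical helper: cosine of the angle between two axis directions.
axis_cosine <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

cos_12 <- axis_cosine(G_x["x1", ], G_x["x2", ])  # same direction
cos_13 <- axis_cosine(G_x["x1", ], G_x["x3", ])  # opposite directions
cos_14 <- axis_cosine(G_x["x1", ], G_x["x4", ])  # close to a right angle

# Compare the cosines with the sample correlations.
round(c(cos_12 = cos_12, cos_13 = cos_13, cos_14 = cos_14), 2)
round(cor(X)["x1", c("x2", "x3", "x4")], 2)
```

A cosine near 1 corresponds to axes lying in the same direction, a cosine near -1 to axes lying in opposite directions, and a cosine near 0 to axes that are close to forming a right angle. In a two-dimensional display these cosines are only indicative, since part of the variation may lie outside the plane.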

Moreover, to illustrate how a covariance monoplot can help to reveal relationships within one set of variables, consider the monoplot of the process factors shown in Figure 3. From this monoplot display several relationships can be deduced, such as the relation within process factors TOTAVF, GBA, PKR_30, KRAV_F, TON_IN, KR40_IN, KR30_IN and PKR_40, with approximated (positive) correlation values of 0.944, 0.876, 0.907, 0.927, 0.918, 0.900 and 0.921 respectively, as shown in Table 4. Factors TON_S3, HS_1 and HS_2 can be said to be unrelated to each other. Likewise, from the covariance monoplot of the output variables (Figure 4), output variables r_FAR, PAR and FAR can be said to be related, with approximated (positive) correlation values of 0.675, 0.705 and 0.920 respectively, as shown in Table 5. Also, a (negative) relation between variables Percent_P_FAR and Percent_Fe_FAR (-0.864) can be noted.

Conclusion

The covariance matrix of two sets of variables can be visualized graphically using the biplot. The resulting biplot, termed the covariance biplot, reveals the relationships between variables graphically. If only one set of variables is considered in the covariance analysis, the resulting graphical representation is a covariance monoplot. Advantages of a covariance biplot include the revelation of the relationships between two sets of variables as well as within each set.

Software

The SOVR data can be obtained from the Dropbox link [16] under the "Data Sets" folder. A collection of functions has been developed in the R programming language [17] to produce the biplot display of the chemical-sensory data in Figure 1 and the covariance biplot and monoplot displays of the SOVR data in Figures 2-4. These functions are available in the R package called PLSbiplot1 by Oyedele [18], and can be found in the Comprehensive R Archive Network (CRAN) repository at [19]. Detailed documentation for all the routines in this package can be found at the Dropbox link [20].

The following R code was used to obtain Figures 1-4:

# Install the PLSbiplot1 package:
# first download the PLSbiplot1_0.1.tar.gz file from
# CRAN at [19] and install it into R.

# Load the PLSbiplot1 package
require(PLSbiplot1)

# Olive oil data
if (require(pls))
  data(oliveoil, package = "pls")

Dmat = as.matrix(oliveoil)

dimnames(Dmat) = list(paste(c("G1","G2","G3","G4","G5","I1","I2","I3","I4","I5","S1","S2","S3","S4","S5","S6")),
                      paste(c("Acidity","Peroxide","K232","K270","DK","Yellow","Green","Brown","Glossy","