Mingan Yang

doi:10.19080/BBOAJ.2020.10.555782

Mini review

Bayesian Mixed Effects Model with Variable Selection

Mingan Yang*

Division of Biostats and Epidemiology, San Diego State University, USA

Submission: June 22, 2020; Published: JAugust 21, 2020

*Corresponding author: Mingan Yang, Division of Biostats and Epidemiology, School of Public Health, San Diego State University, USA.

How to cite this article: Mingan Y. Bayesian Mixed Effects Model with Variable Selection. Biostat Biom Open Access J. 2020; 10(2): 555782. DOI: 10.19080/BBOAJ.2020.10.555782

Abstract

Recently, many approaches have been proposed to address the problem of selecting both fixed and random effects in mixed effects models. In this article, we review several approaches by comparing their procedures and performances, discussing their similarities and differences, and explaining their advantages and disadvantages.

Keywords: Variable selection; Random effects; Mixed effects model; Bayesian model selection; Parameter expansion; Stochastic search

Abbreviations: LME: Linear Mixed Effects; MCMC: Markov Chain Monte Carlo; AIC: Akaike Information Criterion; BIC: Bayesian Information Criterion; GIC: Generalized Information Criterion; SSVS: Stochastic Search Variable Selection

Introduction

Linear mixed effects (LME) models [1] are widely used in longitudinal studies to analyze correlated or clustered data. Generally, the random effects are incorporated to account for heterogeneity among the subjects. In analysis of an LME model, a primary objective is to select potential significant fixed effects and random effects of the outcome variables. Practically, one might be able to use a selection approach (e.g., back-ward elimination or forward selection) and apply a standard criterion, such as the Akaike information criterion (AIC), generalized information criterion (GIC), Bayesian information criterion (BIC), and Bayesian factor, to choose a preferred model by fitting all the possible models repeatedly. However, the number of competing models increases exponentially with the number of predictors. Thus, there are many challenging issues associated with the problem of joint selection of both fixed and random effects such as: intense computation, increased prediction error with increased number of covariates [2], bias associated in estimated variance of the fixed effects and near singular random effect covariance matrix for underfitting and overfitting of the random effects, respectively [3]. In this paper, we review several exemplary approaches by comparing their procedures and performances, investigating their similarities and differences, and explaining their advantages and disadvantages.

The Method

Suppose we have n subjects in a study and each subject has ni repeated observations. For i=1; … , n. Let yij denote the response variable for subject i at observation j; Xij the corresponding l x 1 predictor; and Zij a predictor vector of dimension q x1. Then, we de ne the LME model as follow:

subvector of ij X that include all the candidate predictors. We are interested in selecting a subset of important predictors in the model.

Chen & Dunson [4] addressed the problem by using the reparameterization approach of a modified Cholesky decomposition of the random effects covariance matrix. The covariance Ω can be decomposed as

excluded from the model. Γ is a lower triangular matrix,

with all diagonal elements being 1 and the other free elements characterizing correlations between the random effects. With this random effects covariance matrix decomposition, model (1) takes the form:

Chen & Dunson [4] showed that by rearranging terms, the covariance matrix of the random effects can be expressed

λand r keep desirable conditional conjugacy properties and we can construct a Gibbs sampling algorithm for sampling the posterior distribution using the SSVS approach.

The priors are specified as follows. Let

be the vector of coefficients for the selected fixed effects in the current model, JXbe the corresponding covariates matrix. An i.i.d. Bernoulli prior is assumed for

Jβis assumed to be a Zellner g-prior [5],

where IG(.) is an inverse Gamma distribution, 0(.)δdenotes a point mass at zero, N+(.) is a truncated positive normal distribution. The lower triangular free elements of Γ is put a normal distribution prior. For easy notation, we denote above zero-inflated truncated positive normal prior as ,~(0,1)kpkZINλ+. g is put a Gamma prior G(1/2; 1/2) and 2σa Jeffrey’s prior 21σ=or an inverse Gamma prior.

Chen & Dunson [4] specified the prior ~(0,).iNIξ Kinney & Dunson [6] used the approach of Gelman [7] and specified the covariance matrix ()(1,...,)iqVDiagddξ−. With this specification, the parameters ,ΔΓand ()iVξare not identifiable. In fact, Kinney & Dunson [6] took the parameter-expansion approach [8,9] that improved computational efficiency and reduces dependence among the parameters. We should note the Chen & Dunson [4] only considered variable selection of random effects. Kinney & Dunson [6] extended it to joint selection of both fixed and random effects for linear and logist models; in addition, the approach of Kinney & Dunson [6] also overcame the computational inefficiency due to slow mixing of the Gibbs sampler

Although it is simple and convenient to assume that the random effects are normally distributed. However, there are several limitations with such specifications: the assumption is often not reasonable; thus misspecification of random effects might result in misleading interpretation and even incorrect results. In addition, it is challenging to specify nonparametric distribution for the random effects since there is bias associated with fixed effects estimates when the expected values of random effects are not zero. To resolve the bias of random effects, Yang [10] and Yang (2013) [11] used the approaches of the Probit stick-breaking (PSB) and location-scale symmetrized PSB (sPSB) [12] for linear and logist models with joint variable selection for both fixed and random effects. They define

takes a Gaussian kernel. The heteroscedastic scale PSB mixture and heteroscedastic sPSB location-scale mixtures are defined as follows:

for the random effects in truncated form. Obviously, the nonidentifiability issue is resolved since both PSB and sPSB are centered at vector zero.

Yang (2012) [10] & Yang (2013) [11] provided nonparametric approaches for linear and logist models for joint variable selection of both fixed and random effects. Their approaches are much more flexible than those of Chen & Dunson [4] and Kinney & Dunson [6]. However, the computation is more intense. Later Yang et. al. [13] used the shrinkage priors for mixed effects models with variable selection. The approach is efficient in shrinking small coefficients to zero while minimally shrinking large coefficients due to the heavy tails. They use several popular shrinkage priors: generalized double pareto prior [14], the horse shoe prior [15], and normal-exponential-gamma prior [16], respectively, as follows for the fixed effects:

where Ca+(0; c) denotes a standard half-Cauchy distribution on the positive reals with scale parameter c. The performances of the shrinkage approaches are very good while the computations are not that intense.

Conclusion

In this article, we reviewed several approaches of linear and logistic models for joint selections of both xed and random effects. The approaches of Chen & Dunson [4] and Kinney & Dunson [6] are simple in implementation and provide reasonably good results. The approaches of Yang (2012) [10] & Yang (2013) [11] are much more flexible and provide much better results though the computations are intense. The approach of Yang et.al. [13] maintains a good balance of benefits of the above-mentioned parametric and nonparametric approaches. In summary, the performance and computation intensity by descending orders are Yang (2012) [10] & Yang (2013) [11], Yang et al. [13], Chen & Dunson [4] and Kinney & Dunson [6].

BBOAJ.MS.ID.555782

Our Media Partner

BBOAJ Menu

Useful Links

Downloads