Simulation Study of Confounder Selection Procedures using the Hoffman's Collapsibility Test
Shiro Tanaka*
Department of Clinical Biostatistics, Kyoto University, Japan
Submission: May 29, 2017; Published: July 27, 2017
*Corresponding author: Shiro Tanaka, Department of Clinical Biostatistics, Graduate School of Medicine, Kyoto University, Kyoto, Japan, Email: gogosata@gmail.com
How to cite this article: Shiro T. Simulation Study of Confounder Selection Procedures using the Hoffman's Collapsibility Test. Biostat Biometrics Open Acc J. 2017: 2(3): 555590. DOI: 10.19080/BBOAJ.2017.02.555590.
Abbreviations: AA: Always Adjusted; AO: Always Omitted; CSP: Confounder Selection Procedure; CT: Collapsibility Test; LT: Log-Rank Test; RMSE: Root Mean Square Error
Introduction
Hoffman et al. proposed a convenient collapsibility test for Cox regression [1]. Collapsibility tests have been considered an effective way to select confounders for inclusion in multiplicative rate models [2]. However, it is not reasonable to expect a confounder selection procedure (CSP) based on Hoffman's test to perform as shown in [2]. First, unlike in multiplicative rate models, the regression coefficient of exposure in Cox regression is not collapsible even if the covariate is independent of exposure [3]. Second, the impact of a CSP may differ when confounders are adjusted for by propensity score [4]. Although Hoffman et al. did not recommend applying their test to confounder selection problems, these issues should be addressed.
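The non-collapsibility described in the first point can be checked numerically. The sketch below is illustrative only and rests on assumptions that are not taken from [1-4]: a single standard normal covariate Z independent of X, a conditional hazard 0.1exp(βX+αZ) with exp(α)=2 and exp(β)=3, and the hypothetical function names marginal_hazard and marginal_hazard_ratio. It averages over Z by Gauss-Hermite quadrature among subjects still at risk at time t, and compares the resulting marginal hazards of exposed and unexposed subjects.

```python
import numpy as np

# Illustrative assumptions (not taken from the article): Z ~ N(0, 1) independent
# of X, conditional hazard 0.1 * exp(beta * X + alpha * Z), X coded -1/2 or 1/2.
ALPHA = np.log(2.0)
BETA = np.log(3.0)

# Gauss-Hermite nodes and weights for the standard normal density,
# normalized so that the weights sum to one.
NODES, WEIGHTS = np.polynomial.hermite_e.hermegauss(40)
WEIGHTS = WEIGHTS / WEIGHTS.sum()


def marginal_hazard(t, x):
    """Hazard at time t for exposure x, averaged over Z among survivors."""
    rate = 0.1 * np.exp(BETA * x + ALPHA * NODES)
    survival = np.exp(-rate * t)
    return np.sum(WEIGHTS * rate * survival) / np.sum(WEIGHTS * survival)


def marginal_hazard_ratio(t):
    """Marginal hazard ratio, exposed (X = 1/2) versus unexposed (X = -1/2)."""
    return marginal_hazard(t, 0.5) / marginal_hazard(t, -0.5)


for t in (0.0, 5.0, 20.0):
    print(f"t = {t:4.1f}: marginal hazard ratio = {marginal_hazard_ratio(t):.3f}")
```

At t=0 the marginal ratio equals the conditional ratio exp(β). At later times it falls below exp(β) because high-risk subjects leave the exposed risk set faster, even though Z is independent of X at baseline.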
Simulation Settings
In this simulation study, we randomly generated 1000 data sets from a hypothetical cohort study of N=500 and N=5000. The data sets included an observed survival time T, a censoring indicator δ, a binary exposure X=-1/2, 1/2, a known confounder Z1, and a potential confounder Z2, which were generated in four steps. First, Z=(Z1, Z2) follows a multivariate normal distribution with mean zero, unit variance, and correlation r=0.5. Second, X given Z follows a Bernoulli distribution with conditional probability given by logit{Pr(X=1/2|Z)}=log(0.5)+log(2)Z1+γZ2. Third, the true survival time T* given X and Z follows an exponential distribution with hazard λ=0.1exp(βX+log(2)Z1+αZ2). Finally, T=min(T*, ηU), where U follows a uniform distribution and η was specified so that the censoring probability is 0.8. Table 1 presents results from the following combinations of parameters: {exp(γ), exp(α), exp(β)}=(1,1,3), (2,1,3), (1,2,3), (2,2,3).
After data generation, we estimated β by fitting four Cox regression models: λ(t)=λ0(t)exp(βX+α1Z1+α2Z2), λ(t)=λ0(t)exp(βX+α1Z1), λ(t)=λ0a(t)exp(βX), and λ(t)=λ0c(t)exp(βX), where λ0a(t) and λ0c(t) are baseline hazard functions stratified by quintile of the propensity score estimated by logistic regression with covariate vector Z=(Z1, Z2) or Z1, respectively. These models were selected by four CSPs: Always Adjusted (AA), Always Omitted (AO), Hoffman's Collapsibility Test (CT) [1], and Log-rank Test (LT). In the LT-procedure, Z2 is included if Z2 is significantly associated with T in a Cox regression with covariate vector Z. Significance levels were set to 0.2.
The performance was evaluated in terms of bias and root mean square error (RMSE) in β, each expressed relative to the AA-procedure through β̄AA and RMSEAA, the simulated mean and RMSE from the AA-procedure. A negative value in RMSE indicates improvement over the AA-procedure.
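The four data-generating steps can be sketched as follows. This is a minimal sketch, not the code used in the study. It makes three assumptions the text does not state: U is Uniform(0, 1), the censoring indicator δ equals 1 when the event is observed, and η is calibrated in each data set by bisection so that the average censoring probability over the sampled hazards equals 0.8. The function names simulate_cohort, calibrate_eta, and censoring_probability are hypothetical.

```python
import numpy as np


def censoring_probability(eta, hazard):
    """Average of Pr(T* > eta * U) over subjects, assuming U ~ Uniform(0, 1).

    For exponential T* with rate h, Pr(T* > eta * U) = (1 - exp(-h * eta)) / (h * eta).
    """
    a = hazard * eta
    return float(np.mean(-np.expm1(-a) / a))


def calibrate_eta(hazard, target, lo=1e-6, hi=1e6, iterations=100):
    """Bisection on a log scale; the censoring probability decreases as eta grows."""
    for _ in range(iterations):
        mid = np.sqrt(lo * hi)
        if censoring_probability(mid, hazard) > target:
            lo = mid  # too much censoring: eta must be larger
        else:
            hi = mid
    return np.sqrt(lo * hi)


def simulate_cohort(n, gamma, alpha, beta, rng, r=0.5, target_censoring=0.8):
    """Generate one data set following the four steps in the text."""
    # Step 1: Z = (Z1, Z2) bivariate normal, mean 0, unit variance, correlation r.
    cov = np.array([[1.0, r], [r, 1.0]])
    z = rng.multivariate_normal(np.zeros(2), cov, size=n)
    z1, z2 = z[:, 0], z[:, 1]

    # Step 2: exposure coded -1/2 or 1/2 from the logistic model.
    linear = np.log(0.5) + np.log(2.0) * z1 + gamma * z2
    prob = 1.0 / (1.0 + np.exp(-linear))
    x = np.where(rng.random(n) < prob, 0.5, -0.5)

    # Step 3: exponential true survival time with subject-specific hazard.
    hazard = 0.1 * np.exp(beta * x + np.log(2.0) * z1 + alpha * z2)
    t_star = rng.exponential(1.0 / hazard)

    # Step 4: censoring at eta * U (assumed calibration of eta, see lead-in).
    eta = calibrate_eta(hazard, target_censoring)
    c = eta * rng.random(n)
    t = np.minimum(t_star, c)
    delta = (t_star <= c).astype(int)  # assumed coding: 1 = event observed
    return {"t": t, "delta": delta, "x": x, "z1": z1, "z2": z2, "eta": eta}


rng = np.random.default_rng(2017)
data = simulate_cohort(500, gamma=np.log(2.0), alpha=np.log(2.0), beta=np.log(3.0), rng=rng)
print("observed censoring proportion:", 1.0 - data["delta"].mean())
```

The call above corresponds to the scenario {exp(γ), exp(α), exp(β)}=(2,2,3) with N=500. Fitting the four Cox models would require a survival analysis library and is left out of the sketch.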
Simulation Results
In Table 1, biases in the AO-procedure were present in scenarios (1,2,3), (2,2,1), and (2,2,3) under regression adjustment, while β was numerically non-collapsible only in (2,2,1) and (2,2,3) under propensity-score stratification. As expected, the CT-procedure frequently included Z2 when β is not collapsible, even if Z2 is not a confounder (41% for N=500, 95% for N=5000). Therefore, Hoffman's test is a valid collapsibility test but cannot exclude a non-confounder if it correlates with T. Further, the power of the CT-procedure to detect a confounder was consistently lower than that of the LT-procedure when N=500.
Notably, in scenarios (1,2,1) and (1,2,3) with propensity-score stratification, we see results similar to [4]: the AA-procedure improved RMSE by 2.4 to 4.2% over the AO-procedure, which always specified the correct exposure model but missed a non-confounder that correlates with T. As noted above, the CT-procedure failed to include Z2 in (1,2,1) and (1,2,3), yielding lower RMSE than the LT-procedure (0 to 0.7%).
We also performed sensitivity analyses and observed qualitatively the same pattern. The efficiency gain in propensity-score analysis was high when the prevalence of exposure or the correlation between confounders was low, but very small when Z1 was high-dimensional, e.g. 20 variables. In conclusion, Hoffman's test yields a less efficient estimate of the exposure effect in confounder selection problems (Table 1).
References
- Hoffmann K, Pischon T, Schulz M, Schulze MB, Ray J, et al. (2008) Statistical test for the equality of differently adjusted incidence rate ratios. Am J Epidemiol 167(5): 517-522.
- Maldonado G, Greenland S (1993) Simulation study of confounder selection strategies. Am J Epidemiol 138: 923-936.
- Greenland S, Robins JM, Pearl J (1999) Confounding and collapsibility in causal inference. Statist Sci 14(1): 29-46.
- Brookhart MA, Schneeweiss S, Rothman KJ, Glynn RJ, Avorn J, et al. (2006) Variable selection for propensity score models. Am J Epidemiol 163(12): 1149-1156.