Abstract
Incomplete crossover designs, in which each subject receives only a subset of the treatments under comparison, are frequently employed to evaluate the effects of drug treatments. These designs often involve binary data, which presents particular challenges such as restricted information, small sample sizes, and a lack of flexible analytical approaches. In this manuscript, we review several approaches to the analysis of incomplete crossover designs, focusing on recently proposed approaches that address these challenges.
Keywords: Binary Data; Crossover Trial; Incomplete Block Crossover Trial
Abbreviations: IBCD: Incomplete block crossover designs; CMLE: Conditional Maximum Likelihood Estimator; MSE: Mean Square Error
Introduction
Crossover trials, a popular variant of the randomized block design, involve administering multiple treatments to each subject across several periods, so that each individual serves as his or her own control and variability in treatment comparisons is reduced. They are widely used in pharmaceutical and medical studies to compare treatments for diseases. For example, in the most commonly used crossover design [1,2], with only two treatments A and B, some subjects receive treatment A first and B second, while the others receive treatment B first and A second. Much of the current statistical literature focuses on crossover trials with continuous outcomes. However, a growing number of studies use a binary response, e.g., relief/no relief or improvement/no improvement, to evaluate drug effects, and only a limited number of approaches have been proposed for such studies.
Incomplete block crossover designs (IBCD) [3] are often employed because of practical considerations such as resource constraints and potential subject dropout. In an IBCD, each subject receives only a subset of the treatments, which brings challenges such as small sample sizes and limited data. Only a few studies have evaluated the IBCD. For example, Senn [3] proposed an approach for continuous data, Lui and Chang [4] proposed a weighted least squares approach for binary data, and Lui [7] further developed a conditional likelihood approach for binary data. However, these approaches have limitations: they either apply only to continuous data or, for binary data, have difficulty accommodating zero counts and rely on asymptotic theory. Yang [5] proposed a Bayesian approach for the IBCD. In this article, we review several popular approaches for the IBCD and compare their performance.
The article is organized as follows: Section 2 describes the frequentist approaches. Section 3 describes the Bayesian approach of Yang [5]. Section 4 provides a simulation study. Section 5 concludes with a discussion.
Frequentist Approaches
General Description
Jones and Kenward [6] considered a three-period crossover trial that compared two treatments and placebo for the relief of primary dysmenorrhea. They proposed a log-linear model that mirrored the analysis of continuous data. However, such studies face common challenges such as logistical support, long study duration, and the risk of loss to follow-up. In addition, the model of Jones and Kenward [6] is fairly complicated; its main drawback for higher-order designs is that many extra parameters are needed.
To overcome the challenges of the IBCD, Lui [7] fitted a random effects logistic regression model and derived the conditional maximum likelihood estimator (CMLE) for the relative effect between treatments with binary responses. Consider a study comparing two experimental treatments A and B with a placebo (P) under a two-period crossover trial, and let X-Y denote the group with the treatment sequence in which a patient receives treatment X in the first period and then crosses over to receive treatment Y in the second period. Lui [7] derived the logistic regression model:
where μgi denotes the random effect of the ith patient assigned to group g, and ηAP and ηBP denote the relative treatment effects of A and B versus placebo, respectively. The relative treatment effect is estimated as
and an estimated asymptotic variance for log(ψAP) is obtained as
where ψAP = exp(ηAP). Based on the above equations, Lui [7] derived an approximate 100(1 − α)% confidence interval for ψAP as
All the other relative treatment effects are derived similarly. These results are fairly complicated. Moreover, IBCD studies generally have small sample sizes, so results that rely on asymptotic approximations might not be accurate. Further details can be found in Lui [7].
Later on, Lui and Chang (2017) developed much simpler results using a logistic random effects model. For example, the estimated relative treatment effect ηAP is
where ng^rc denotes the number of patients in group g with the vector of responses (r, c), r = 1, 0 and c = 1, 0, among the ng patients.
Although these results seem straightforward, in many cases some of the cell counts ng^rc are 0. It is then very difficult to obtain the estimated treatment effects and the corresponding variances [8].
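The difficulty with zero counts can be illustrated with a minimal sketch. It assumes a generic estimator built from the log ratio of two cell counts; this is an illustration only and is not the estimator of Lui and Chang. The function name `log_ratio` and the counts are hypothetical.

```python
import math

def log_ratio(n10, n01, correction=0.0):
    """Log ratio of two cell counts, with an optional continuity correction.

    Hypothetical helper for illustration; not the estimator from the cited papers.
    """
    a, b = n10 + correction, n01 + correction
    if a <= 0 or b <= 0:
        return None  # log(0) or division by zero: the estimate is undefined
    return math.log(a / b)

print(log_ratio(6, 3))        # both cells positive: well defined
print(log_ratio(4, 0))        # a zero cell: undefined (None)
print(log_ratio(4, 0, 0.5))   # defined only after adding 0.5 to each cell
```

With a zero cell, the value depends entirely on the size of the correction that is added, which is one way to see why such corrections can influence the estimates.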
Bayesian Approach of Yang [5]
To overcome the above challenges of the frequentist approaches, Yang [5] proposed a Bayesian approach for IBCD studies. Consider a general crossover design consisting of J periods and M treatments, denoted as T1, T2, · · · , TM, in which a subset of the M treatments is applied during the J periods. For instance, a sequence (g(1), g(2), · · · , g(J)) is formed when the treatments Tg(1), · · · , Tg(J) are applied in the J periods. Let ygij denote the continuous outcome from subject i at period j under treatment sequence g. The model can be specified as
where u0 is the overall mean; ηg is the fixed effect of the gth sequence, g = 1, · · · , G; ψj is the fixed effect of the jth period; tl(g,j) is the fixed treatment effect, with l(g, j) the treatment index; μgi is the random effect of the ith subject from the gth sequence, i = 1, · · · , ng, with μgi ∼ N(0, ρ²); and ϵgij is a random error assumed to follow a normal distribution. In the above model, the carryover effect is omitted, assuming sufficient washout between dosing periods.
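Written out, the terms defined above combine into a linear mixed model of the following form. This is a reconstruction that assumes the terms enter additively; the variance of the error term is not named in the text and is left unspecified here.

```latex
y_{gij} = u_0 + \eta_g + \psi_j + t_{l(g,j)} + \mu_{gi} + \epsilon_{gij},
\qquad \mu_{gi} \sim N(0, \rho^{2})
```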
Yang [5] further used several algorithms, namely data augmentation, the scale mixture of normals representation, and parameter expansion, to improve efficiency. Specifically, for binary outcomes the logistic model [9] is defined as follows:
where H(.) is the logit link function with H(κ) = log(κ/(1 − κ)).
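For reference, a form of the logistic model consistent with these definitions is sketched below. It assumes the same linear predictor as in the continuous model, which matches the location parameter used later in the latent variable representation.

```latex
H\{\Pr(y_{gij} = 1 \mid \mu_{gi})\} = u_0 + \eta_g + \psi_j + t_{l(g,j)} + \mu_{gi},
\qquad H(\kappa) = \log\frac{\kappa}{1 - \kappa}
```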
Conditional conjugacy cannot be achieved for the posteriors because the logistic model is nonlinear. To overcome this, Yang [5] takes several approximation steps to convert the nonlinear model into standard linear models. First, the logistic distribution can be closely approximated by the t distribution (Albert and Chib [10]; Holmes and Knorr-Held [11]; O’Brien and Dunson [12]). With auxiliary variables, Model (6) is equivalent to the following representation:
where y*gij is an underlying value following the logistic distribution with location parameter u0+ηg+ψj+tl(g,j)+μgi and density function as follows:
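A standard way to write this representation and the logistic density is given below. It is a sketch under the assumption that the observed response is the indicator of a positive latent value, with q_{gij} used as shorthand for the location parameter.

```latex
y_{gij} = 1(y^{*}_{gij} > 0), \qquad
f(y^{*}_{gij}) = \frac{\exp\{-(y^{*}_{gij} - q_{gij})\}}
                      {\left[1 + \exp\{-(y^{*}_{gij} - q_{gij})\}\right]^{2}},
\qquad q_{gij} = u_0 + \eta_g + \psi_j + t_{l(g,j)} + \mu_{gi}
```

Under this form, Pr(y*gij > 0) equals the inverse logit of the location parameter, so the representation agrees with the logistic model.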
Second, as noted by West [13], the t distribution can be expressed as a scale mixture of normals. Thus y*gij is approximated by a t distribution with location parameter u0+ηg+ψj+tl(g,j)+μgi, υ degrees of freedom, and scale parameter σ². Expressing it as a scale mixture of normals gives the following model:
where ϕgij has a Gamma prior G(υ/2, υ/2). As suggested by O’Brien and Dunson [12], we take υ = 7.3 and σ² = π²(υ − 2)/(3υ) to make the approximation almost exact.
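The quality of this approximation can be checked numerically. The sketch below assumes the mixture takes the form N(0, σ²/ϕ) with ϕ drawn from a Gamma distribution with shape υ/2 and rate υ/2, and compares the empirical distribution of the draws with the standard logistic distribution. The function names, grid, seed, and sample size are illustrative choices, not taken from Yang [5].

```python
import math
import random

NU = 7.3                                      # degrees of freedom from the text
SIGMA2 = math.pi ** 2 * (NU - 2) / (3 * NU)   # scale chosen to match the logistic variance

def draw_scale_mixture(rng):
    """One draw from N(0, SIGMA2 / phi) with phi ~ Gamma(shape NU/2, rate NU/2)."""
    phi = rng.gammavariate(NU / 2, 2 / NU)    # gammavariate takes (shape, scale = 1/rate)
    return rng.gauss(0.0, math.sqrt(SIGMA2 / phi))

def logistic_cdf(x):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-x))

rng = random.Random(2024)
draws = [draw_scale_mixture(rng) for _ in range(100_000)]

def empirical_cdf(x):
    """Fraction of mixture draws at or below x."""
    return sum(d <= x for d in draws) / len(draws)

grid = (-4, -2, -1, 0, 1, 2, 4)
max_gap = max(abs(empirical_cdf(x) - logistic_cdf(x)) for x in grid)
print(f"largest CDF gap on the grid: {max_gap:.4f}")
```

The choice of σ² makes the variance of the t distribution, σ²υ/(υ − 2), equal to π²/3, the variance of the standard logistic distribution.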
Priors and Posteriors
Yang [5] used the same priors as for the continuous outcomes in model (5). A normal prior is specified for the overall mean, u0 ∼ N(μ1, σ1²). Similar priors are selected for the sequence effect, period effect, and the treatment effect:
The random effect μgi is specified as μgi ∼ N(0, ρ²), the hyperparameter ρ² is given an inverse gamma prior, ρ² ∼ IG(a0, b0), and ϕgij is given a Gamma prior G(υ/2, υ/2). Based on the model and prior specifications, the joint posterior distribution for θ = (u0, η, ψ, t, ϕ) can be derived as follows:
where wgij = {1(y*gij > 0) ygij + 1(y*gij < 0)(1 − ygij)} p(ϕgij), and p(.) = p(ρ²)p(u0)p(t)p(η)p(ψ). This posterior is too complicated to sample from directly. By introducing the latent variable y*gij, we apply a data augmentation [14] algorithm and can sample the parameters and hyperparameters of interest using the Gibbs sampler. In particular, the auxiliary variable is updated from a normal distribution truncated below or above 0 according to the value of ygij.
The conditional posterior of the auxiliary variable is:
where qgij=u0+ηg+ψj+tl(g,j)+μgi. The full conditional posterior distributions are specified in (9). The procedures and most of the conditional posterior distributions [14] are very similar to those for the continuous outcomes. The detailed sampling steps are listed in the Appendix. We run the Gibbs sampler by iteratively sampling all the parameters and hyperparameters of interest.
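The update of the auxiliary variable can be sketched as a draw from a truncated normal distribution. The sketch assumes that, given ϕgij, the latent value is normal with mean qgij and variance σ²/ϕgij, truncated to positive values when ygij = 1 and to negative values when ygij = 0. The function `draw_latent` is a hypothetical helper that uses simple inverse-CDF sampling; it is not the implementation of Yang [5].

```python
import math
import random
from statistics import NormalDist

def draw_latent(y, q, sigma2, phi, rng):
    """Draw y* from N(q, sigma2/phi) truncated to (0, inf) if y == 1, else (-inf, 0)."""
    dist = NormalDist(mu=q, sigma=math.sqrt(sigma2 / phi))
    p0 = dist.cdf(0.0)                       # probability mass below zero
    if y == 1:
        u = rng.uniform(p0, 1.0)             # restrict to the part above zero
    else:
        u = rng.uniform(0.0, p0)             # restrict to the part below zero
    # Keep u strictly inside (0, 1); far in the tails this clamp is too crude
    # and a dedicated truncated-normal sampler would be needed.
    u = min(max(u, 1e-12), 1 - 1e-12)
    return dist.inv_cdf(u)

rng = random.Random(0)
print(draw_latent(1, q=0.3, sigma2=3.0, phi=1.0, rng=rng))   # positive draw
print(draw_latent(0, q=0.3, sigma2=3.0, phi=1.0, rng=rng))   # negative draw
```

Inverse-CDF sampling is used here only because it is short and needs no extra libraries; other truncated-normal samplers would serve equally well within a Gibbs step.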
Simulation
To evaluate the performance of the above-mentioned approaches, we conduct a simulation study for a logistic mixed effects model. We consider comparing two experimental treatments A and B and a placebo P under a two-period crossover design. We use C-D to denote the group with the treatment sequence in which a subject receives treatment C in the first period and treatment D in the second period. Thus, there are six groups in total (A-P, B-P, P-A, P-B, A-B, B-A). We assume that there are no carry-over effects, given an adequate washout period. We arbitrarily set the overall mean u0 equal to 0.10. We generate the random effects μgi independently and identically from a normal distribution with mean 0 and standard deviation 0.5, 1.0, or 2.0. We set the following four cases for the treatment effects (placebo, A, and B): -0.15, 0.30, -0.15 (case 1); -0.25, 0.00, 0.25 (case 2); -0.15, -0.15, 0.30 (case 3); 0.15, 0.15, -0.30 (case 4). The number of patients per group is n=15, 20, or 25. In clinical trials [15], researchers are generally interested in the relative effects among treatments. Thus, we focus on the results for the relative effects in the simulation. We use tcd = tc − td to denote the relative treatment effect between treatments c and d. For the above four cases, the relative treatment effects (t21, t31, t32) are: 0.45, 0.00, -0.45 (case 1); 0.25, 0.50, 0.25 (case 2); 0.00, 0.45, 0.45 (case 3); 0.00, -0.45, -0.45 (case 4). The period effects and group effects are all set to 0 since they are of much less interest than the relative treatment effects. For the priors, we specify μ1=μ2=μ3=μ4=0.5, σ1=σ2=σ3=σ4=5.0, and a0=b0=0.2.
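The data-generating step described above can be sketched as follows for case 1. The sketch assumes that a response is drawn as a Bernoulli variable with success probability given by the inverse logit of u0 plus the treatment effect plus the subject random effect, and that a "zero frequency cell" means any of the four response patterns being absent in a group; the text does not spell out either detail. All function names and the seed are illustrative.

```python
import math
import random
from collections import Counter

TREATMENT = {"P": -0.15, "A": 0.30, "B": -0.15}    # case 1 effects from the text
GROUPS = ["A-P", "B-P", "P-A", "P-B", "A-B", "B-A"]
U0 = 0.10                                          # overall mean

def simulate_trial(n, sd, rng):
    """Return {group: [(y1, y2), ...]} with n subjects per group."""
    data = {}
    for group in GROUPS:
        first, second = group.split("-")
        rows = []
        for _ in range(n):
            mu = rng.gauss(0.0, sd)                # subject random effect
            pair = []
            for trt in (first, second):
                eta = U0 + TREATMENT[trt] + mu     # period and sequence effects are 0
                p = 1.0 / (1.0 + math.exp(-eta))   # inverse logit
                pair.append(1 if rng.random() < p else 0)
            rows.append(tuple(pair))
        data[group] = rows
    return data

def has_zero_cell(data):
    """True if any of the four (r, c) response patterns is absent in some group."""
    patterns = [(1, 1), (1, 0), (0, 1), (0, 0)]
    return any(Counter(rows)[pat] == 0 for rows in data.values() for pat in patterns)

# Relative effects (t21, t31, t32) with placebo, A, B indexed as 1, 2, 3.
t = [TREATMENT["P"], TREATMENT["A"], TREATMENT["B"]]
relative = (round(t[1] - t[0], 2), round(t[2] - t[0], 2), round(t[2] - t[1], 2))
print(relative)

rng = random.Random(1)
trial = simulate_trial(n=15, sd=2.0, rng=rng)
print({g: dict(Counter(rows)) for g, rows in trial.items()})
print("zero cell present:", has_zero_cell(trial))
```

Repeating `simulate_trial` over many seeds and averaging `has_zero_cell` would give a rough estimate of how often zero cells occur for a given n and random-effect standard deviation.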
We generate 500 simulated samples of n subjects per group with the bivariate outcome (ygi1, ygi2). Between about 5% (n=25, σ = 0.5) and 70% (n=15, σ = 2.0) of the data sets have zero frequency cells, where σ here denotes the standard deviation of the random effects. We run the Gibbs sampling algorithm as described in the previous section and the Appendix. Three independent chains with widely dispersed starting values were run to assess convergence. After an initial 5,000 iterations, the scale reduction factors of the Gelman-Rubin [16] approach (Gelman and Rubin [17]) indicate good convergence. We use the next 20,000 iterations to calculate the estimates for the parameters of interest. We also run simulations with varying means and variances of the priors for the hyperparameters to evaluate their effects. We do not observe any noticeable differences in the parameter estimates.
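The convergence check can be sketched with the basic form of the scale reduction factor, computed from the within-chain and between-chain variances of a single scalar parameter. This is a simplified version without later refinements of the statistic, and the function name and test chains are illustrative.

```python
import random
from statistics import mean, variance

def scale_reduction(chains):
    """Basic potential scale reduction factor for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])     # W: average within-chain variance
    between = n * variance(chain_means)              # B: between-chain variance
    pooled = (n - 1) / n * within + between / n      # pooled estimate of the variance
    return (pooled / within) ** 0.5

rng = random.Random(3)
mixed = [[rng.gauss(0, 1) for _ in range(2000)] for _ in range(3)]
stuck = [[rng.gauss(m, 1) for _ in range(2000)] for m in (0, 3, 6)]
print(scale_reduction(mixed))   # close to 1 when chains sample the same distribution
print(scale_reduction(stuck))   # well above 1 when chains disagree
```

Values close to 1 suggest that the chains have mixed; values well above 1 indicate that more iterations are needed.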
Table 1 provides the bias and mean square error (MSE) of the relative treatment effects under the varying scenarios. Table 2 provides the coverage of the 95% intervals for the relative treatment effects. The tables show that the Bayesian approach performs well. In particular, unlike the approaches of Lui and Chang [4] and Lui [7], the approach of Yang [5] does not suffer from zero frequencies in some cells. We also note that the percentage of generated data sets with zero frequency cells can be up to almost 80% in some cases. Although Lui [7] and Lui and Chang [4] suggested adding 0.5 to cells with zero counts, this can substantially affect the estimated results.
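The summary measures reported in the tables can be computed from the simulated data sets as sketched below. The function name `summarize` and the toy numbers are hypothetical and are not results from the simulation study.

```python
def summarize(estimates, intervals, truth):
    """Bias, mean square error, and interval coverage over simulated data sets."""
    n = len(estimates)
    bias = sum(estimates) / n - truth
    mse = sum((e - truth) ** 2 for e in estimates) / n
    coverage = sum(lo <= truth <= hi for lo, hi in intervals) / n
    return bias, mse, coverage

# Toy inputs: four point estimates and their 95% intervals for a true effect of 0.45.
estimates = [0.40, 0.50, 0.45, 0.55]
intervals = [(0.1, 0.7), (0.2, 0.8), (0.5, 0.9), (0.0, 1.0)]
bias, mse, coverage = summarize(estimates, intervals, truth=0.45)
print(bias, mse, coverage)
```

In the simulation setting, `estimates` would hold one posterior summary (for example the posterior mean) per generated data set, and `intervals` the corresponding 95% intervals.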


Discussion
For crossover trials [18-20], there are many challenging issues, such as logistical support, long study duration, and small sample sizes. Modeling of the IBCD raises further issues of identifiability, reliability, and intensive computation. In this manuscript, we review several popular approaches and compare their performance: the frequentist approaches of Jones and Kenward [6], Lui and Chang [4], and Lui [7], and the Bayesian approach of Yang [5]. The frequentist approaches for the IBCD are easily hampered by cell counts of 0, whereas the Bayesian approach of Yang [5] is not. In particular, the approach of Yang [5] uses data augmentation, the scale mixture of normals representation, and parameter expansion to obtain closed-form conditional posterior distributions and improve efficiency. The simulation study indicates that the Bayesian approach [21,22] provides reliable results and good performance.
Acknowledgment
The author thanks colleagues for their comments and critical reading of the manuscript.
References
- Grieve AP (1985) A Bayesian analysis of the two-period crossover design for clinical trials. Biometrics 41: 979-990.
- Grieve AP (1995) Extending a Bayesian analysis of the two-period crossover to accommodate missing data. Biometrika 82: 277-286.
- Senn S (2002) Cross-over trials in clinical research. 2nd edn. Wiley, Chichester, UK, pp. 1-347.
- Lui KJ, Chang KC (2015) Test and estimation in binary data analysis under an incomplete block crossover design. Computational Statistics and Data Analysis 81: 130-138.
- Yang M (2013) Bayesian nonparametric centered random effects models with variable selection. Biometrical Journal 55: 217-230.
- Jones B, Kenward MG (1987) Modelling binary data from a three-period cross-over trial. Statistics in Medicine 6: 555-564.
- Lui KJ (2015) Estimation of the treatment effect under an incomplete block crossover design in binary data-A conditional likelihood approach. Statistical Methods in Medical Research 26(5): 2197-2209.
- Polson NG, Scott JG, Windle J (2013) Bayesian inference for logistic models using Pólya-Gamma latent variables. Journal of the American Statistical Association 108: 1339-1349.
- Yang M, Dunson DB, Baird D (2010) Semiparametric Bayes hierarchical models with mean and variance constraints. Computational Statistics and Data Analysis 54: 2172-2186.
- Albert J, Chib S (1997) Bayesian tests and model diagnostics in conditionally independent hierarchical models. Journal of the American Statistical Association 92: 916-925.
- Holmes C, Knorr-Held L (2003) Efficient simulation of Bayesian logistic regression models. Technical report, Ludwig Maximilians University Munich, Germany.
- O’Brien SM, Dunson DB (2004) Bayesian multivariate logistic regression. Biometrics 60: 739-746.
- West M (1987) On scale mixtures of normal distributions. Biometrika 74: 646-648.
- Liu JS, Wu YN (1999) Parameter expansion for data augmentation. Journal of the American Statistical Association 94: 1264-1274.
- Fleiss JL (1986) The design and analysis of clinical experiments. Wiley, New York, USA.
- Liu CH, Rubin DB, Wu YN (1998) Parameter expansion to accelerate EM: The PX-EM algorithm. Biometrika 85: 755-770.
- Gelman A, Rubin DB (1992) Inference from iterative simulation using multiple sequences. Statistical Science 7: 457-511.
- Basu S, Santra S (2010) A joint model for incomplete data in crossover trials. Journal of Statistical Planning and Inference 140: 2839-2845.
- Hills M, Armitage P (1979) The two-period cross-over clinical trial. British Journal of Clinical Pharmacology 8: 7-20.
- Senn S (2006) Cross-over trials in statistics in medicine: the first 25 years. Statistics in Medicine 25: 3430-3442.
- Yang M, Dunson DB (2010) Bayesian semiparametric structural equation models with latent variables. Psychometrika 75: 675-693.
- Yang M (2012) Bayesian variable selection for logistic mixed model with nonparametric random effects. Computational Statistics and Data Analysis 56: 2663-2674.

















