Conditional Analysis and Unconditional Conclusion in a Clinical Trial with Interim Analysis

In a properly defined statistical analysis, the basic requirements include a complete specification of the sample space and of the statistical model used in inference. These two are the key components of a statistical analysis; if there is any fuzziness in either of them, the analysis loses its foundation. For a clinical trial with interim analysis, although many techniques have been developed to deal with the error inflation caused by repeated significance tests, the specifications of these key components are not complete, and the clinical trial with interim analysis needs further refinement.


Introduction
In a properly defined statistical analysis, the basic requirements include a complete specification of the sample space and of the statistical model used in inference. These two are considered the key components of a statistical analysis; if there is any fuzziness in either of them, the statistical analysis loses its foundation. For a clinical trial with interim analysis, although many techniques have been developed to deal with the error inflation caused by repeated significance tests, the specifications of these key components are not complete, and the clinical trial with interim analysis needs further refinement.
A typical clinical trial design with interim analysis is shown in Xia [1]; a letter to the editor, Chen [2], criticized that paper for failing to show the detailed statistical model associated with the analysis. The rejoinder prepared by the original authors, Xia [3], showed that conditional probability was applied in the analysis. By default, a conditional statistical analysis can only reach a conditional statistical conclusion, but the claimed study medication efficacy in a clinical trial is generally an unconditional statement. The relationship between the unconditional trial conclusion and the conditional statistical model was not explicitly specified.
The challenge can be understood in a simpler and more straightforward way. A clinical trial is a scientific experiment, and the experiment result, or the trial conclusion, has to be presented and interpreted in a scientific way. Consider a clinical trial with interim analysis as a stochastic process passing through a series of quality gates. For N processes started at the beginning of the trial, the number of processes passing through all gates and reaching the final accept region can be computed in expectation, and the claimed α value should be consistent with the ratio of this expected value to N. Any clinical trial design unable to lay out such an outcome is subject to criticism.
Considering the data accumulation in a clinical trial with interim analysis as a stochastic process, this manuscript has three main purposes. The first is to clarify the sample space and the statistical model for a clinical trial analysis; the second is to clarify the relationship between the conditional analysis and the unconditional trial conclusion; the third is to show that the process has to be ergodic and stationary.
As a significant finding, the simulation result shows that the error inflation caused by repeated significance tests is in doubt. In fact, the error inflation is related to the algorithm used for the hypothesis test and the parameter estimation. In practice, the error spending philosophy presented in Lan [4] is widely used without a detailed discussion of the hypothesis test algorithm; such a practice may have a negative impact on the confidence of the hypothesis test.

Setup of the sample space
It is noted that, depending on the setup of the sample space, a clinical trial with interim analysis can be interpreted in different ways, which may lead to different statistical models. Obviously, the random samples at different trial stages will fall in different subsets of the full sample space; if the sample space is assumed to be unchanged, the analysis at each stage can only be a conditional analysis.
The problem with Xia [1] is that it did not explicitly define the sample space for the clinical trial with interim analysis. The rejoinder Xia [3] claimed the sample space Ω to be a K-dimensional discrete space with K orthogonal axes. The fundamental questions are what statistical model is defined on such a space, and how the trial conclusion is derived in such a sample space. More specifically, if the K axes are treated as the axes of a Cartesian coordinate system, the summation should be understood as an operation in a vector space, not simple arithmetic summation.
One of the most important tasks of this manuscript is to show that the sample space for the trial with interim analysis remains the same Ω, which cannot be the claimed multidimensional space.
Any statistical inference is with respect to Ω and the fixed statistical measure P. Without such an explicit setup of the analysis foundation, the type I error referred to in Xia [1] does not have basic support. In other words, an explanation of how α should be interpreted is needed.

Conditional analysis and unconditional conclusion
When performing a conditional analysis, most of the estimations are in fact conditional estimations. For example, within a census database, if the mean age of the female subjects is estimated, the output will be the mean age of the female population. Obviously, the mean age of the female population can differ from that of the general population, but the definition of the mean remains the same.
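As a minimal sketch of this point, the following toy example (the records and ages are made-up illustration data, not from any census) contrasts the unconditional mean with the conditional mean computed by the very same estimator:

```python
# Toy "census" table; the records and ages are made-up illustration data.
records = [
    {"sex": "F", "age": 34}, {"sex": "M", "age": 41},
    {"sex": "F", "age": 29}, {"sex": "M", "age": 52},
    {"sex": "F", "age": 45}, {"sex": "M", "age": 38},
]

def mean_age(rows):
    """Arithmetic mean of the `age` field over the given rows."""
    return sum(r["age"] for r in rows) / len(rows)

# Unconditional mean: average over the whole population.
overall = mean_age(records)
# Conditional mean: the same estimator applied to the female subset only.
female = mean_age([r for r in records if r["sex"] == "F"])
```

The two estimates generally differ, yet both use the same definition of the mean; only the subset of the sample space being averaged over changes.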
Returning to the clinical trial with interim analysis: at the final stage of the trial, only the samples that passed the (K−1)th interim gate can be observed. The resulting estimate is surely a conditional estimate; whether it can be interpreted as the unconditional conclusion depends upon the trial setup. Up to this point, such a setup is not evident in the available literature.

Basic assumptions
In a clinical trial analysis, assume the data accumulation forms a stochastic process. With or without any interim analysis, the process has to be ergodic and stationary; see Walters [5]. Ergodicity implies that the process average over time is the same as the average over the sample space, so the estimation accuracy can be improved by prolonging the observation of the process. Stationarity implies that the statistical model remains the same over time, so any statistical conclusion refers to an unchanged statistical model. Besides, a proper statistical analysis requires that the sample space remain the same over time.
Whether these assumptions are met in the clinical trial design has to be declared explicitly in the protocol or the statistical analysis plan; they serve as prerequisites for the statistical analysis. If necessary, the properties have to be proved. If a proof is impossible, the assumptions have to be stated in the trial setup, and they must not be violated during the trial progress.
On the other hand, for a clinical trial with interim analysis, if there is an error inflation caused by the repeated significance tests, the error inflation should be intrinsic. In other words, the error inflation may depend upon the estimation algorithm, and the algorithm that causes the smallest error inflation should be applied.

Error inflation and its elimination
The error inflation caused by the repeated significance test is a well-known phenomenon; for example, see Bonferroni [6]. But unfortunately, such a phenomenon is just an observation in practice; its existence has never been mathematically proved. The observed error inflation may be caused by the estimation algorithm used for the test, and different estimation algorithms may lead to different error inflation rates. For example, Anscombe [7] observed that inference through the likelihood may be unaffected. The techniques developed to deal with the error inflation caused by repeated significance tests also have many problems. For example, Armitage [8] developed a quadrature to deal with the error inflation, the α spending methodology developed by Lan [4] used a similar quadrature in its derivation, and the work of Xia [1] implicitly followed the same pattern. It is noted that neither work discussed the estimation algorithm in detail, yet the error inflation is sensitive to the algorithm. In this sense, without an extensive validity discussion regarding the α spending, different α spending methodologies can be applied along with different estimation algorithms, which may lead to different statistical conclusions.
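To make the dependence on the algorithm concrete, the following sketch simulates one particular algorithm: repeatedly applying an unadjusted two-sided z-test to accumulating mean-zero normal data at four evenly spaced looks. The look schedule, sample sizes, and simulation count are arbitrary choices for illustration; under this particular algorithm the overall rejection rate exceeds the nominal 0.05:

```python
import math
import random

def inflated_error(n_sims=2000, looks=(25, 50, 75, 100), z_crit=1.96, seed=7):
    """Fraction of null (mean-zero) processes rejected at ANY unadjusted look."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        total = 0.0   # running sum of the N(0, 1) observations
        n = 0
        for look in looks:
            while n < look:
                total += rng.gauss(0.0, 1.0)
                n += 1
            z = total / math.sqrt(n)   # z-statistic for H0: mean = 0
            if abs(z) > z_crit:        # unadjusted two-sided test at each look
                rejections += 1
                break                  # the process stops at the first rejection
    return rejections / n_sims
```

With four unadjusted looks at the 1.96 threshold, the empirical overall type I error comes out well above 0.05, illustrating the inflation under this specific repeated-testing algorithm.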

More criticisms on Xia's formulation
The validity of the statistical model presented in Xia [1] is in question. The complete specification of a statistical model has to state what information can be directly observed from the data and what parameters are to be estimated. For the parameters to be estimated, the complete specification has to outline the model and the algorithm for parameter estimation, since different models lead to different estimation procedures. More specifically, whether r is a parameter has to be specified in the manuscript.
In Xia [1], for either case of r = 1 or r = 0.5, r is considered a constant, not a parameter. In that case, it remains unclear under the hypothesis what kind of trial outputs will be observed.
Besides, Xia [1] stated that "we assume equal allocation of person years into the vaccinated and placebo groups at each stage, as is typically done. But as shown in Section 3 our methods are robust to non-equal allocations and nonconstant proportions." The statement is not true, since when ρ_k changes over k, the process is no longer stationary, which violates the basic requirements for the analysis. Finally, the derivation of the "adjusted futility safety" boundary has not been described in detail; based on the information given by the manuscript, it is impossible to reproduce the results in Table 1 of Xia [1].
Moreover, there are still other challenges to the setup of the statistical model. Among them, the efficacy of the study medication is generally characterized by an unconditional statistical quantity, since the claim of efficacy is generally an unconditional statement. The relationship between the unconditional conclusion and the above conditional statistical model has to be clarified. The conditional probability mass function (PMF) presented in the manuscript is a function of t_1, ..., t_k; in other words, for a different set of T_1, ..., T_k values, say τ_1, ..., τ_k, the conditional PMF will be different. This is against the basic definition of the study medication efficacy. The efficacy of the study medication is an unconditional measure, which implies that the efficacy does not depend upon the conditions under which the clinical trial is performed. When a conditional probability model is used, it is necessary to establish the relationship between the conditional analysis and the unconditional conclusion. Without such a relationship, the conclusion of the clinical trial with interim analysis is not convincing.

Statistics Fundamentals
These statistics fundamentals are essential to the conditional probability analysis; they explain why unconditional estimations can be obtained from a conditional analysis.

Sample space and measure
In statistical analysis, a σ-field (or σ-algebra) on a set Ω is a collection F of subsets of Ω that includes the empty set, is closed under complement, and is closed under countable unions (and hence countable intersections). The pair (Ω, F) is called a measurable space. There are three key motivations for a σ-field: defining measures, manipulating limits of sets, and managing partial information.
To perform a statistical analysis, a measure P is defined on the measurable space. For a clinical trial analysis, it is assumed that the statistical model remains the same across the stages, so the measure P does not change over the stages. The triplet (Ω, F, P) forms a probability space.
F ⊂ 2^Ω, where 2^Ω is the power set of Ω containing all its subsets, and 2^Ω is itself a σ-algebra.
When only partial information can be characterized with a smaller σ-algebra that is a subset of the principal σ-algebra, it is considered a sub-σ-algebra. In our notation, the principal σ-algebra is F. It is important to see that, based on this setup, the unchanged P and Ω secure the requirements that the process be stationary and that the sample space be consistent throughout the analysis.

Filtration
In the setup of the probability space (Ω, F, P), a filtration is a sequence of σ-fields F_0, F_1, ..., F_K such that:
I. each F_k is a sub σ-field of F;
II. F_j ⊆ F_k whenever j ≤ k;
III. F_0 = {∅, Ω} is the trivial σ-field.
Essentially a filtration is a sequence of σ-fields such that each new σ-field corresponds to the additional information that becomes available at each step, and thus to a further refinement of the sample space Ω. It should be noted that a filtration with the second property is also called non-anticipating, i.e., one cannot see into the future.
At the kth stage of the clinical trial with interim analysis, the trial sponsor cannot arbitrarily assume the clinical outcome beyond that stage; the best prediction of the trial outcome beyond the kth stage is based on the data accumulated up to the kth stage. In fact, any prediction of the process beyond the kth stage is against the nature of the clinical trial goals.

By the definition of conditional probability,

P(B|A) = P(A ∩ B) / P(A),

it is obvious that P(A ∩ B) = P(B|A)P(A) = P(A|B)P(B), so that

P(A|B)P(B) = P(B|A)P(A).

It is important to understand the relationship F_0 ⊆ F_1 ⊆ ... ⊆ F_K.

Filtration and the foundation for analysis
Any statistical analysis has to be built upon a uniquely defined statistical distribution P. It is crucial to realize that for an event in F, its measure µ has to be specified. The problem with Xia [1] is that it did not specify such a relationship explicitly, so the trial design is not convincing. In a clinical trial analysis, the results of subgroup analyses can only be considered observational; they cannot be a confirmative conclusion. This practice more or less reflects such a logic, although the reasons are generally not given explicitly.
The relationship between E(µ) and E(µ|F_i) within our framework will be shown in later sections along with the definition of the filtration. It is in fact the key to this manuscript.

Filtration associated with interim analysis
It has been shown that throughout the clinical trial with interim analysis, the sample space Ω and the statistical model P remain unchanged, so the statistical inference at all stages is with respect to Ω and P. Assume the statistical inference is based on the likelihood ratio test, which will be discussed in detail in the following section; the test statistics are λ_1, ..., λ_k, ..., λ_K. The filtration with respect to Ω can then be taken as F_0 ⊆ F_1 ⊆ ... ⊆ F_K, where F_k is generated by λ_1, ..., λ_k.

Clinical trial with interim analysis
The study hypothesis is set up the same as that in Chen [9].

Trial inference
It is noted that −2 log λ_k is a χ² random variable with 1 degree of freedom for k = 1, 2, ..., K, where the subscript k represents the kth interim analysis during the trial. It should be noted that during the inference, two fundamental assumptions have been made. First, every process falling in the accept region at the (k−1)th interim gate will automatically proceed to the kth stage, and every process falling in the rejection region at the (k−1)th interim gate will be terminated. Second, the ergodic property is mandatory, so that N_K/N can be the estimate of 1 − α. In theory, 1 − α refers to the probability that the samples from one process fall in the accept region over time, but N_K/N refers to the result of multiple processes at the specific time point K. The ergodicity property guarantees that the average over time and the average over the sample space will be the same.
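As a sketch of the test statistic described above, the likelihood ratio statistic −2 log λ for comparing two binomial arms can be computed as follows. The function names are our own, and the rejection threshold quoted in the comment assumes the asymptotic χ² distribution with 1 degree of freedom under H0: p1 = p2:

```python
import math

def loglik(s, n, p):
    """Binomial log-likelihood of s successes in n trials at rate p."""
    out = 0.0
    if s > 0:
        out += s * math.log(p)
    if s < n:
        out += (n - s) * math.log(1 - p)
    return out

def lr_statistic(s1, n1, s2, n2):
    """-2 log(lambda) for H0: p1 == p2 against the unrestricted alternative."""
    p0 = (s1 + s2) / (n1 + n2)                          # pooled MLE under H0
    ll_null = loglik(s1, n1, p0) + loglik(s2, n2, p0)
    ll_alt = loglik(s1, n1, s1 / n1) + loglik(s2, n2, s2 / n2)
    return -2.0 * (ll_null - ll_alt)

# Reject H0 at alpha = 0.05 when the statistic exceeds the chi-square
# critical value 3.841 (df = 1).
```

When the two arms show identical success rates the statistic is zero, and it grows as the observed rates diverge, matching the asymptotic χ² behavior assumed in the inference above.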

Rationale of the trial design
Returning to the previous discussion, the above analysis provides the rationale linking the observation to the trial goal. More specifically, at the Kth stage of the trial, only the processes that have passed all previous interim gates can be observed; without such a rationale, the trial design itself will have a problem. In this sense, the above analysis is by no means redundant; it serves a key role in linking the conditional analysis with the unconditional statistical conclusion.
Another important finding of the analysis is that the continuation and stopping rules implied by the trial design are strict; any trial execution in violation of the rules will destroy the integrity of the trial. Although multiple violations may compensate for each other, strict quantification might be difficult. Besides, concepts such as the adjusted futility safety boundary or the adjusted efficacy boundary are no longer necessary, since the likelihood ratio test is based on different test statistics than the conventional analysis.
It is noted that the above analysis is not a strict proof; a strict proof would concern the behavior of the full sequence of interim gates. In fact, such a proof is related to the existence of the error inflation caused by the repeated significance tests. Our argument can be considered a weak proof, since it at least shows that both the conditional analysis and the unconditional conclusion rest on the same foundation Ω and P.

Example
As an example, a simulation was performed with a pseudo random number generator. The outcomes of SM and AC follow the Bernoulli distribution with the same cure rate p = 0.5, so the efficacies of SM and AC are the same. It is assumed that α = 0.05 remains the same over the stages, and there are 4 evenly spaced interim analyses. As a result, the α_k and the corresponding critical values c_k are distributed as in Table 1, based on the χ² distribution with 2 degrees of freedom. In each simulation run, 1000 pairs of SM and AC efficacy samples were created, and the interim gates were placed with even spacing over the 1000 pairs. The simulation was run 1,000 times, and the numbers of processes passing each interim gate, N_k, were summarized.
The simulation shows that among 1000 trial processes, at the first interim gate, 950 trials concluded that SM performed the same as AC. The result is consistent with the presumption that α = 0.05. This simulation did not explicitly deal with the error inflation, but since the likelihood ratio test was used in the statistical inference, the error inflation caused by the repeated significance tests was not obviously observed. Besides, the likelihood ratio test is based on the mean of the random variables, not the lower or upper bound of a confidence interval, so the α spending methodology is not used, and the adjustment for the error inflation may not be necessary.
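A simulation of this kind can be sketched roughly as follows. This is our own simplified reconstruction, not the original code: it applies a binomial likelihood ratio test at each of four evenly spaced gates and, for simplicity, uses a single fixed critical value (3.841, χ² with df 1 at α = 0.05) at every gate instead of the per-stage values c_k from Table 1:

```python
import math
import random

def lr_stat(s1, n1, s2, n2):
    """-2 log(lambda) for H0: p1 == p2 (binomial likelihood ratio test)."""
    def ll(s, n, p):
        out = 0.0
        if s > 0:
            out += s * math.log(p)
        if s < n:
            out += (n - s) * math.log(1 - p)
        return out
    p0 = (s1 + s2) / (n1 + n2)   # pooled MLE under H0
    return -2.0 * (ll(s1, n1, p0) + ll(s2, n2, p0)
                   - ll(s1, n1, s1 / n1) - ll(s2, n2, s2 / n2))

def run_trials(n_trials=1000, n_pairs=1000, gates=4, p=0.5, crit=3.841, seed=1):
    """Count how many simulated trial processes pass each interim gate."""
    rng = random.Random(seed)
    looks = [n_pairs * (k + 1) // gates for k in range(gates)]
    passed = [0] * gates
    for _ in range(n_trials):
        sm = ac = n = 0
        for k, look in enumerate(looks):
            while n < look:                 # accumulate pairs up to this gate
                sm += rng.random() < p      # study medication (SM) outcome
                ac += rng.random() < p      # active control (AC) outcome
                n += 1
            if lr_stat(sm, n, ac, n) < crit:
                passed[k] += 1              # process survives this gate
            else:
                break                       # process terminated at this gate
    return passed
```

The returned list is the gate-passing counts N_1, ..., N_K; by construction the counts are non-increasing across the gates, since a terminated process cannot reach a later gate.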

Interpretation of the confidence
In any clinical trial design, a confidence level is claimed. But within the trial setup of Xia [1] (2007), the foundation for this claim is not clear. Specifically, a cited α value has to be associated with a sample space and a chosen statistical model. Within a conditional analysis, the independence of α from the trial execution conditions has not been explicitly shown. If the trial execution conditions have impacts on the α value, the trial conclusion will be in question.
In our setup of the clinical trial, α is always measured by P on the foundation of Ω, which remains unchanged throughout the trial. More specifically, the α estimation is independent of the filtration F_1, ..., F_{K−1}, so the trial conclusion is completely unconditional.
A debate may arise since it is shown that 1 − α does not depend upon any of the previous interim gate criteria, so N_K/N is independent of all the interim gates. In fact, N_K is not completely independent of N_k for k < K, since obviously N_K ≤ N_k for all k < K: a trial stopped at any interim gate cannot proceed to the final quality gate. The analysis showed that the estimation across the sequence of interim gates forms a Markov chain; it will not be discussed in detail here, and the door for further study remains open.

Distribution assumptions
In this manuscript, the binomial distribution is assumed. The same logic is also applicable to statistical analyses based on the Poisson model as well as on other statistical models. The rejoinder Xia [3] disqualified the binomial model on the grounds that the comparative Poisson model deals with a lower probability rate, but the probability rate of the Poisson model was not discussed in detail. In fact, if the understanding of the presented Poisson model is correct, its probability rate is ρ_i, which is assumed to be 0.5, close to the probability rate p = 0.5 in this binomial setup.
The key argument is that for any distribution assumption, the clarification of the parameters and of the estimation algorithm is mandatory for the model specification. The maximum likelihood estimates should be adopted, and the statistical inference should be based on the likelihood ratio test; the error inflation caused by the repeated significance tests may then not exist. Naturally, this manuscript cannot be considered a theoretical proof that the likelihood ratio test completely eliminates the error inflation in practice; the topic remains open for further discussion.

Main contributions
The main purpose of this manuscript is to show that, whether a conditional or an unconditional analysis is used, the statistical analysis of clinical trial data is with respect to a fixed P and Ω, and the trial conclusion should always be unconditional. In other words, the trial conclusion is always independent of the conditions under which the trial was executed. On the other hand, considering that the trial data accumulation forms a stochastic process, the process has to be ergodic and stationary. If these properties are not secured, any trial conclusion will not rest on a solid statistical foundation.