Constrained Bayesian Methods for Testing Directional Hypotheses Restricted False Discovery Rates
K. J. Kachiashvili1,2, I.A. Prangishvili1 and J. K. Kachiashvili1
1Faculty of Informatics and Control Systems, Georgian Technical University, Georgia
2I. Vekua Institute of Applied Mathematics of the Tbilisi State University, Georgia
Submission: November 01, 2018; Published: March 13, 2019
*Corresponding author: K J Kachiashvili, Faculty of Informatics and Control Systems, Georgian Technical University, Georgia and Vekua Institute of Applied Mathematics of the Tbilisi State University, Tbilisi, Georgia
How to cite this article: Kachiashvili KJ, Prangishvili IA. Constrained Bayesian Methods for Testing Directional Hypotheses Restricted False Discovery Rates. Biostat Biometrics Open Acc J. 2019; 9(3): 555761. DOI: 10.19080/BBOAJ.2019.09.555761
Abstract
The constrained Bayesian method (CBM) and the concept of false discovery rates (FDR) for testing directional hypotheses are considered in the paper. It is shown that the direct application of CBM allows us to control FDR at the desired level. It is proved theoretically that mixed directional false discovery rates (mdFDR) are restricted to the desired levels by a suitable choice of the restriction levels in different statements of CBM. The correctness of the theoretical results is confirmed by computation results for concrete examples.
Subject Classifications: 62F15; 62F03.
Keywords: Directional hypotheses; Constrained Bayesian method; False discovery rate; Mixed directional false discovery rates; False acceptance rate
Abbreviations: CBM: Constrained Bayesian Method; DFDR: Directional False Discovery Rate; FDR: False Discovery Rate; mdFDR: Mixed Directional False Discovery Rate; FAR: False Acceptance Rate
Introduction
The traditional formulation of testing a simple basic hypothesis versus a composite alternative is a well-studied problem treated in many scientific works [1-8]. The problem of making an inference about the direction of the difference between the parameter values defined by the basic and alternative hypotheses is important in many applications [9-17]. Here the decision whether the parameter exceeds or falls below the value defined by the basic hypothesis is meaningful. For parametric models, this problem can be stated as
![Equation (1): $H_0: \theta = \theta_0$ versus $H_-: \theta < \theta_0$ or $H_+: \theta > \theta_0$](images/BBOAJ.MS.ID.555761.E001.png)
where θ is the parameter of the model and $\theta_0$ is known. These alternatives are called skewed or directional alternatives. Directional hypotheses have found applications in different fields, among them biology, medicine, engineering and so on [17,18]. Work on the appropriate tests “has just begun to stir up some interests in the educational and behavioral literature” [19-22].
The directional false discovery rate (DFDR) or the mixed directional false discovery rate (mdFDR) is used when the alternatives are skewed [17]. The optimal procedures controlling DFDR (or mdFDR) use two-tailed procedures, assuming that the directional alternatives are symmetrically distributed. The decision rule is therefore symmetric with respect to the parameter value defined by the basic hypothesis [14,23]. For experiments where the distribution of the alternative hypotheses is skewed, an asymmetric decision rule, based on skew-normal priors and on Bayesian methodology for testing while minimizing mdFDR, is offered in Bansal et al. [17]. There it is proved theoretically “that a skewed prior permits a higher power in number of correct discoveries than if the prior is symmetric”. This result is confirmed by a simulation study comparing the proposed rule with a frequentist’s rule and with the rule offered in Benjamini et al. [23]. Because CBM allows us to account for the skewness not only through the prior probabilities but also through the restriction levels in the constraints, it is expected to give a more powerful decision rule, in terms of the number of correct discoveries, than the existing decision rules whose priors are symmetric or asymmetric. Therefore, different statements of CBM for testing skewed hypotheses with restricted mdFDR are considered below.
In Section 2, some possible statements of CBM for testing directional hypotheses are considered, and it is proved that FDR can be controlled at the desired level for each statement of CBM. The proposed theoretical results are made concrete for normally distributed directional hypotheses in Section 3. Computation results for a concrete example with normal basic and truncated normal alternative hypotheses, obtained by simulating the appropriate samples, are given in Section 4. A discussion of the obtained results and the conclusions are presented in Sections 5 and 6, respectively.
CBM for testing directional hypotheses
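Different statements of CBM for testing a set of hypotheses are given in Kachiashvili et al. [24], Kachiashvili [25] and Kachiashvili et al. [26]. They differ from each other in the kind of restrictions placed on the Type I or Type II errors made in testing. Let us introduce the following notation for the statement of the hypothesis testing problem [27]. Let the sample $x$ be generated from $p(x;\theta)$, and let the problem of interest be to test $H_i: \theta \in \Theta_i$, $i = 1, \dots, S$, where the $\Theta_i$ are disjoint subsets whose union is the whole parameter space; $p(H_i)$ is the a priori probability of hypothesis $H_i$; $\pi(\theta \mid H_i)$ is a prior density with support $\Theta_i$; $p(x \mid H_i)$ denotes the marginal density of $x$ given $H_i$; $D = \{d\}$ is the set of solutions, where $d = (d_1, \dots, d_S)$, it being so that $d_i = 1$ if hypothesis $H_i$ is accepted and $d_i = 0$ otherwise; $\delta(x) = \{\delta_1(x), \dots, \delta_S(x)\}$ is the decision function that associates each observation vector $x$ with a certain decision $d \in D$; $\Gamma_j$ is the region of acceptance of hypothesis $H_j$, i.e., $\Gamma_j = \{x : \delta_j(x) = 1\}$. It is obvious that $\delta(x)$ is completely determined by the $\Gamma_j$ regions, i.e., $\delta(x) = \{\Gamma_1, \dots, \Gamma_S\}$. Let $L_1(H_i, \delta_j(x) = 1)$ and $L_2(H_i, \delta_j(x) = 0)$ be the losses of incorrectly accepted and incorrectly rejected hypotheses. Then the total loss of incorrectly accepted and incorrectly rejected hypotheses, $L(H_i, \delta(x))$, is the following: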
Adapting this notation to skewed hypotheses, let us consider some of the possible statements of CBM for testing directional hypotheses (1). (Note 1: the numbering of the tasks described below is preserved from Kachiashvili et al. [27].)
Restrictions on the averaged probability of acceptance of true hypotheses (Task 1)
In this case, CBM has the following form (Kachiashvili et al. [28]): minimize the averaged loss of incorrectly accepted hypotheses
where $r_1$ is some real number determining the level of the averaged loss of incorrectly rejected hypotheses. For directional hypotheses (1) and loss functions
using the concept of posterior probabilities, problem (3), (4) takes the following form (Kachiashvili et al. [28]):
subject to
Solving problem (6), (7) by the method of Lagrange undetermined multipliers gives
where the Lagrange multiplier $\lambda_1$ is determined so that equality holds in condition (7).
(Note 2: for statement (3), (4), as well as for the other statements (see Tasks 2, 4 and 5 below), depending on the value of $x$, it is possible that $\delta_j(x) = 1$ for more than one $j$ or that $\delta_j(x) = 0$ for all $j \in (-, 0, +)$.)
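The situation described in Note 2 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, a threshold rule of the form "accept $H_j$ when $\lambda \, p(H_j \mid x)$ exceeds the sum of the posterior probabilities of the other hypotheses". This rule form, the function name `accepted_hypotheses` and the numerical posterior probabilities are assumptions of the sketch and are not taken from the decision rule of the paper.

```python
def accepted_hypotheses(posteriors, lam):
    """Return the labels j for which lam * p(H_j | x) > sum of the other posteriors.

    ASSUMPTION: this threshold form is chosen only to illustrate Note 2;
    it is not presented as the decision rule derived in the paper.
    """
    accepted = []
    for j, p_j in posteriors.items():
        # Total posterior probability of all hypotheses other than H_j.
        others = sum(p for i, p in posteriors.items() if i != j)
        if lam * p_j > others:
            accepted.append(j)
    return accepted


# Hypothetical posterior probabilities of H_-, H_0 and H_+ at some observation x.
posteriors = {"-": 0.1, "0": 0.3, "+": 0.6}

for lam in (0.5, 1.0, 3.0):
    print(lam, accepted_hypotheses(posteriors, lam))
```

With these hypothetical numbers, the smallest multiplier accepts no hypothesis, the middle value accepts only $H_+$, and the largest accepts both $H_0$ and $H_+$, so the same observation can fall into a region of no decision, a unique decision, or a non-unique decision depending on the multiplier.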
Let us introduce the notation
and call these quantities individual average risks. Then for the average risk (6) we have
When testing directional hypotheses, it is possible to make a false statement about the choice between the alternative hypotheses, i.e., to make a directional error, or type III error [23]. To account for directional errors in terms of the false discovery rate (FDR), two variants of FDR were introduced in Benjamini et al. [29]: the pure directional false discovery rate (pdFDR) and the mixed directional false discovery rate (mdFDR), which are the following:
It has been shown that the FDR is an effective model selection criterion, as it can be translated into a penalty function. FDR therefore makes it possible to increase the power of the test in the general case [30]. Both variants of FDR for directional hypotheses, pdFDR and mdFDR, can be expressed through Type III error rates ($ERR_{III}$):
Here $ERR_{III}^{T}$ and $ERR_{III}^{K}$ are two different forms of the Type III error rate considered by different authors (Mosteller et al. [31]; Kaiser [9]; Jones et al. [13]; Shaffer [14]), and $SERR_{III}$ is the summary type III error rate [25].
Hereinafter, where necessary, the number of the task is appended directly to the abbreviation CBM to indicate the statement under consideration.
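The two directional error measures can be made concrete with a short Python sketch. It assumes the usual proportion-of-discoveries reading of these quantities: a discovery is any decision in favour of $H_-$ or $H_+$, the pure directional proportion counts only wrong-direction discoveries, and the mixed directional proportion also counts rejections of a true $H_0$. The function name, the label encoding and the sample data are hypothetical; averaging these proportions over repeated experiments would give empirical estimates of pdFDR and mdFDR.

```python
def directional_false_discovery_proportions(truth, decisions):
    """Return (pure, mixed) directional false discovery proportions.

    truth:     true hypothesis per test, one of "-", "0", "+".
    decisions: accepted hypothesis per test, one of "-", "0", "+", or None
               when no decision is made (ASSUMPTION: counted as no discovery).
    """
    discoveries = 0         # decisions in favour of H_- or H_+
    false_rejections = 0    # true H_0 rejected
    directional_errors = 0  # true alternative, but wrong direction chosen
    for t, d in zip(truth, decisions):
        if d not in ("-", "+"):
            continue
        discoveries += 1
        if t == "0":
            false_rejections += 1
        elif t != d:
            directional_errors += 1
    denom = max(discoveries, 1)  # convention: proportion is 0 with no discoveries
    pure = directional_errors / denom
    mixed = (false_rejections + directional_errors) / denom
    return pure, mixed


# Hypothetical outcome of six tests.
truth = ["0", "0", "+", "+", "-", "-"]
decisions = ["0", "+", "+", "-", "-", None]
print(directional_false_discovery_proportions(truth, decisions))  # (0.25, 0.5)
```

In this example there are four discoveries, one of which rejects a true $H_0$ and one of which picks the wrong direction, so the pure proportion is 1/4 and the mixed proportion is 2/4.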
Theorem 1. CBM 1 with the restriction level of (7), when the condition
is satisfied, ensures a decision rule with mdFDR (i.e., with $SERR_{III}$) less than or equal to $q$, i.e., with the condition
Proof. Because of the specific nature of the decision rule of CBM, alongside the hypothesis acceptance regions there exist regions where making a decision is impossible [26,32]. Therefore, instead of the condition
of the classical decision-making procedures, the following condition is fulfilled in CBM:
where imd stands for the impossibility of making a decision.
Taking into account (17), condition (7) can be rewritten as follows
It follows from this that
Let’s denote Then from (18) we have
Taking into account (12), we write
This proves the theorem.
Let us define the false acceptance rate (FAR) as follows:
Restrictions on the conditional probabilities of acceptance of each true hypothesis (Task 2)
Minimize (6) subject to
where the Lagrange multipliers are determined so that equalities hold in conditions (22).
Theorem 2. CBM 2 with the restriction levels of (22), on satisfying a condition q, ensures a decision rule with mdFDR (i.e., with $SERR_{III}$) less than or equal to $q$, i.e., with the condition
Proof. Taking into account (12), (17), condition
In our opinion, these properties of Berger’s test and CBM are very interesting and useful. They bring the rule for testing statistical hypotheses much closer to everyday decision making, where, when the necessary information is lacking, accepting one of the suppositions made is not compulsory.
The specific features of the hypothesis testing regions of Berger’s test and CBM, namely the existence of the no-decision region in the former and the existence of regions where making a unique decision, or any decision, is impossible in CBM, make it possible to develop sequential tests on their basis [2,36,26,28]. The sequential test was introduced by Wald in the mid-1940s [37,38]. Since Wald’s pioneering works, many investigations have been dedicated to problems of sequential analysis (see, for example, Berger et al. [39]; Ghosh [40]; Ghosh et al. [41]; Siegmund [42]), and efforts to develop this approach are constantly increasing, as it has many important advantages in comparison with parallel methods [43].
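How a no-decision region leads naturally to a sequential procedure can be sketched in Python: observations are added one at a time until exactly one hypothesis is accepted. The sketch makes several simplifying assumptions that are not taken from the paper: the three hypotheses are replaced by point values of θ, the observations are normal with unit variance, and acceptance uses the illustrative threshold rule "accept $H_j$ when $\lambda \, p(H_j \mid x)$ exceeds the sum of the other posterior probabilities". All names and numbers are hypothetical, and the procedure is neither Wald's test nor the authors' sequential method.

```python
import math


def posterior_probabilities(observations, thetas, priors):
    """Posterior probabilities of point hypotheses theta = thetas[j] under N(theta, 1)."""
    logs = {
        j: math.log(priors[j]) - 0.5 * sum((x - thetas[j]) ** 2 for x in observations)
        for j in thetas
    }
    top = max(logs.values())  # subtract the maximum for numerical stability
    weights = {j: math.exp(v - top) for j, v in logs.items()}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def accepted_hypotheses(posteriors, lam):
    """ASSUMED illustrative rule: accept H_j if lam * p_j exceeds the other posteriors' sum."""
    return [
        j for j, p_j in posteriors.items()
        if lam * p_j > sum(p for i, p in posteriors.items() if i != j)
    ]


def sequential_test(stream, thetas, priors, lam):
    """Consume observations until exactly one hypothesis is accepted.

    Returns (accepted label, number of observations used), or (None, n)
    if the stream ends while still in a no-decision region.
    """
    seen = []
    for x in stream:
        seen.append(x)
        accepted = accepted_hypotheses(posterior_probabilities(seen, thetas, priors), lam)
        if len(accepted) == 1:
            return accepted[0], len(seen)
    return None, len(seen)


# Hypothetical setting: point hypotheses and equal prior probabilities.
thetas = {"-": -1.0, "0": 0.0, "+": 1.0}
priors = {"-": 1 / 3, "0": 1 / 3, "+": 1 / 3}
print(sequential_test([1.0] * 10, thetas, priors, lam=0.1))
```

A smaller multiplier demands a larger posterior probability before a hypothesis is accepted, so the procedure stays in the no-decision region longer and uses more observations.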
Application of CBM to different types of hypotheses (two or many simple, composite, directional and multiple hypotheses) in parallel and sequential experiments has shown the advantage and uniqueness of the method in comparison with existing ones [24-29,44]. The advantage of the method is the optimality of the decisions made, with guaranteed reliability, and the minimal number of observations necessary for a given reliability. CBM uses not only loss functions and a priori probabilities for making decisions, as the classical Bayesian rule does, but also a significance level, as the frequentist method does. The combination of these capabilities improves the quality of the decisions made by CBM in comparison with other methods. This fact has been confirmed many times by applying CBM to the solution of different practical problems [45-47,32,44].
Finally, it must be noted that the detailed investigation of different statements of CBM and the choice of optimal loss functions in the constrained statements of the Bayesian testing problem open wide opportunities in statistical hypothesis testing, with new, previously unknown and interesting properties. On the other hand, stating the Bayesian estimation problem as a constrained optimization problem gives new opportunities for finding optimal estimates with new, previously unknown properties, and it seems that these properties will differ advantageously from those of the approaches known today.
In our opinion, the proposed CBM offer a promising direction for future investigations, which will give researchers the opportunity to obtain new results in the theory and practice of statistical inference, and this corresponds completely to the thoughts of the well-known statistician B. Efron [48]: “Broadly speaking, nineteenth century statistics was Bayesian, while the twentieth century was frequentist, at least from the point of view of most scientific practitioners. Here in the twenty-first century scientists are bringing statisticians much bigger problems to solve, often comprising millions of data points and thousands of parameters. Which statistical philosophy will dominate practice? My guess, backed up with some recent examples, is that a combination of Bayesian and frequentist ideas will be needed to deal with our increasingly intense scientific environment. This will be a challenging period for statisticians, both applied and theoretical, but it also opens the opportunity for a new golden age, rivaling that of Fisher, Neyman, and the other giants of the early 1900s.”
References
- Berger JO (1985) Statistical Decision Theory and Bayesian Analysis, Springer, New York.
- Berger JO, Brown LD, Wolpert RL (1994) A Unified Conditional Frequentist and Bayesian Test for Fixed and Sequential Simple Hypothesis Testing. The Annals of Statistics 22(4): 1787-1807.
- Bernardo JM, Rueda R (2002) Bayesian Hypothesis Testing: A Reference Approach. International Statistical Review 1-22.
- Christensen R (2005) Testing Fisher, Neyman, Pearson, and Bayes. American Statistician 59(2): 121-126.
- Hubbard R, Bayarri MJ (2003) Confusion over Measures of Evidence (p’s) Versus Errors (α’s) in Classical Statistical Testing. The American Statistician 57: 171-177.
- Lehmann EL (1993) The Fisher, Neyman-Pearson Theories of Testing Hypotheses: One Theory or Two?. American Statistical Association Journal, Theory and Methods 88(424): 1242-1249.
- Lehmann EL (1997) Testing Statistical Hypotheses (2nd edn). New York: Springer, USA.
- Moreno E, Giron FJ (2006) On the Frequentist and Bayesian Approaches to Hypothesis Testing, SORT 30(1): 3-28.
- Wolpert RL (1996) Testing simple hypotheses. In: Bock HH, et al. (Eds), Data Analysis and Information Systems (7th Edn), Heidelberg: Springer, pp. 289-297.
- Fisher RA (1925) Statistical Methods for Research Workers, London: Oliver and Boyd, UK.
- Neyman J, Pearson E (1928) On the Use and Interpretation of Certain Test Criteria for Purposes of Statistical Inference. Part I, Biometrika, 20: 175-240.
- Neyman J, Pearson E (1933) On the Problem of the Most Efficient Tests of Statistical Hypotheses, Philos. Trans Roy Soc Ser A 231: 289-337.
- Jeffreys H (1939) Theory of Probability, (1st edn). Oxford: The Clarendon Press, UK.
- Berger JO, Wolpert RL (1988) The Likelihood Principle, (2nd edn). (with discussion). IMS, Hayward: CA, USA.
- Bernardo JM (1980) A Bayesian analysis of classical hypothesis testing. Universidad de Valencia, 1: 605-617.
- Delampady M, Berger JO (1990) Lower bounds on Bayes factors for the multinomial distribution, with application to chi-squared tests of fit. Ann Statist 18: 1295-1316.
- Kiefer J (1977) Conditional confidence statement and confidence estimations (with discussion). J Amer. Statist Assoc 72(360): 789-808.
- Bansal NK, Sheng R (2010) Bayesian Decision Theoretic Approach to Hypothesis Problems with Skewed Alternatives. Journal of Statistical Planning and Inference 140: 2894-2903.
- Bansal NK, Miescke KJ (2013) A Bayesian decision theoretic approach to directional multiple hypotheses problems. Journal of Multivariate Analysis 120: 205-215.
- Bansal NK, Hamedani GG, Maadooliat M (2016) Testing Multiple Hypotheses with Skewed Alternatives. Biometrics 72(2):494-502.
- Berger JO (2003) Could Fisher, Jeffreys and Neyman have Agreed on Testing? Statistical Science 18: 1-32.
- Kachiashvili KJ (2003) Generalization of Bayesian Rule of Many Simple Hypotheses Testing. International Journal of Information Technology & Decision Making 2(1): 41-70.
- Kachiashvili KJ (2011) Investigation and Computation of Unconditional and Conditional Bayesian Problems of Hypothesis Testing. ARPN Journal of Systems and Software 1(2): 47-59.
- Kachiashvili KJ (2014) Comparison of Some Methods of Testing Statistical Hypotheses. Part I. Parallel Methods. International Journal of Statistics in Medical Research 3: 174-189.
- Kachiashvili KJ (2014) Comparison of Some Methods of Testing Statistical Hypotheses. Part II. Sequential Methods. International Journal of Statistics in Medical Research 3: 189-197.
- Kachiashvili KJ (2015) Constrained Bayesian Method for Testing Multiple Hypotheses in Sequential Experiments. Sequential Analysis: Design Methods and Applications 34(2): 171-186.
- Kachiashvili KJ (2016) Constrained Bayesian Method of Composite Hypotheses Testing: Singularities and Capabilities. International Journal of Statistics in Medical Research, 5(3): 135-167.
- Kachiashvili KJ (2018a) Constrained Bayesian Methods of Hypotheses Testing: A New Philosophy of Hypotheses Testing in Parallel and Sequential Experiments. Nova Science Publishers, Inc., New York, pp. 456.
- Kachiashvili KJ (2018) On One Aspect of Constrained Bayesian Method for Testing Directional Hypotheses. Biomed J Sci & Tech Res, 2(5). BJSTR.MS.ID.000821. DOI: 10.26717/BJSTR.2018.02.000821
- Kachiashvili GK, Kachiashvili KJ, Mueed A (2012) Specific Features of Regions of Acceptance of Hypotheses in Conditional Bayesian Problems of Statistical Hypotheses Testing. Sankhya: A 74(1): 112-125.
- Kachiashvili KJ, Hashmi MA, Mueed A (2012) Sensitivity Analysis of Classical and Conditional Bayesian Problems of Many Hypotheses Testing. Communications in Statistics-Theory and Methods 41(4): 591-605.
- Kachiashvili KJ, Hashmi MA, Mueed A (2012) The Statistical Risk Analysis as the Basis of the Sustainable Development. Int J of Innovation and Technol 9(3): 1250024.
- Kachiashvili KJ, Mueed A (2013) Conditional Bayesian Task of Testing Many Hypotheses. Statistics 47(2): 274-293.
- Kachiashvili KJ, Prangishvili AI (2018) Verification in biometric systems: problems and modern methods of their solution. Journal of Applied Statistics 45(1): 43-62.
- Kachiashvili KJ, Hashmi MA, Mueed A (2008) The statistical risk analysis as the basis of the sustainable development. Proceedings of the 4th IEEE International Conference on Management of Innovation & Technology (ICMIT2008), Bangkok, Thailand, 1: 1210-1215.
- Kachiashvili KJ, Hashmi MA (2010) About Using Sequential Analysis Approach for Testing Many Hypotheses. Bulletin of the Georgian Academy of Sciences, 4(2): 20-25.
- Wald A (1947) Sequential analysis. New York: Wiley, USA.
- Wald A (1947) Foundations of a General Theory of Sequential Decision Functions. Econometrica 15: 279-313.
- Berger JO, Wolpert RL (1984) The Likelihood Principle. Institute of Mathematical Statistics Monograph Series (IMS), Hayward: CA, USA.
- Ghosh BK (1970) Sequential Tests of Statistical Hypotheses. Addison-Wesley, Reading, Massachusetts.
- Ghosh BK, Sen PK (1991) Handbook of Sequential Analysis. Dekker, NY, USA.
- Siegmund D (1985) Sequential Analysis. Springer Series in Statistics. Springer-Verlag, NY, USA.
- Tartakovsky A, Nikiforov I, Basseville M (2015) Sequential Analysis: Hypothesis Testing and Changepoint Detection. Taylor & Francis Group, New York, USA.
- Kachiashvili KJ, Bansal NK, Prangishvili IA (2018) Constrained Bayesian Method for Testing the Directional Hypotheses. Journal of Mathematics and System Science 8: 96-118
- Kachiashvili KJ, Melikdzhanian DI (2006) Identification of River Water Excessive Pollution Sources. International Journal of Information Technology & Decision Making. World Scientific Publishing Company, 5(2): 397-417.
- Kachiashvili KJ, Gordeziani DG, Lazarov RG, Melikdzhanian DI (2007) Modeling and simulation of pollutants transport in rivers. AMM 31: 1371-1396.
- Kachiashvili KJ, Hashmi MA, Mueed A (2009) Bayesian Methods of Statistical Hypothesis Testing for Solving Different Problems of Human Activity. AMIM 14(2): 3-17.
- Efron B (2004) Large-Scale Simultaneous Hypothesis Testing. Journal of the American Statistical Association 99(465): 96-104.