Modern State of Statistical Hypotheses Testing and Perspectives of its Development
Kachiashvili KJ1,2*
1 Faculty of Informatics and Control Systems, Georgian Technical University, Georgia
2 Vekua Institute of Applied Mathematics of the Tbilisi State University, Georgia
Submission: January 17, 2019; Published: March 12, 2019
*Corresponding author: K J Kachiashvili, Faculty of Informatics and Control Systems, Georgian Technical University, Georgia and Vekua Institute of Applied Mathematics of the Tbilisi State University, Tbilisi, Georgia
How to cite this article: Kachiashvili KJ. Modern State of Statistical Hypotheses Testing and Perspectives of its Development. Biostat Biometrics Open Acc J. 2019; 9(2): 555759. DOI: 10.19080/BBOAJ.2019.09.555759
Abstract
A statistical hypothesis is a formalized record of the properties of an investigated phenomenon and of the relevant assumptions. Statistical hypotheses are stated when random factors affect the investigated phenomenon, i.e. when the observation results are random. The properties of the investigated phenomenon are completely defined by its probability distribution law. Therefore, a statistical hypothesis is an assumption concerning some property of the probability distribution law of a random variable. Mathematical statistics is the set of methods for studying events caused by random variability and for estimating the measures (the probabilities) of the possibility of occurrence of these events. For this purpose it uses, as a rule, distribution laws. Practically all methods of mathematical statistics use hypotheses testing techniques in one way or another and to varying degrees. Therefore, it is difficult to overestimate the importance of statistical hypotheses testing methods in the theory and practice of mathematical statistics.
Introduction
Many investigations are dedicated to the theory and practice of statistical hypotheses testing (see, for example, Berger [1], Berger et al. [2], Bernardo et al. [3], Christensen [4], Hubbard et al. [5], Lehmann [6,7], Moreno et al. [8], Wolpert [9]), and their number increases steadily. Despite this, there are only three basic ideas (philosophies) of hypotheses testing in parallel experiments: those of Fisher [10], Neyman-Pearson [11,12] and Jeffreys [13]. They use different ideas for testing hypotheses, but all of them are identical in one aspect: they necessarily accept one of the stated hypotheses when making a decision, regardless of whether there is enough information to make the decision with the given reliability. These methods have well-known positive and negative sides. All other existing methods are particular cases of these approaches that take into account the peculiarities of concrete problems and adapt to them in order to increase the reliability of the decision (see, for example, Berger et al. [14]; Bernardo [15]; Delampady et al. [16]; Kiefer [17]; Bansal et al. [18]; Bansal et al. [19]; Bansal et al. [20]).
An attempt to reconcile the different points of view of these philosophies was made in Berger [21], and as a result a new, compromise method of testing was offered. The method uses Fisher's p-value criterion for making a decision, the Neyman-Pearson statement (using basic and alternative hypotheses) and Jeffreys' formulae for computing the Type I and Type II conditional error probabilities for every observation result on the basis of which the decision is made.
A new approach (philosophy) to statistical hypotheses testing, called Constrained Bayesian Methods (CBM), was developed comparatively recently [22-34]. It differs from the traditional Bayesian approach in that the risk function is split into two parts, reflecting the risks of incorrect rejection and of incorrect acceptance of hypotheses, and the risk minimization problem is stated as a constrained optimization problem in which one of the risk components is restricted and the other is minimized. It generates data-dependent measures of evidence with regard to the level of restriction. In spite of the absolutely different motivations for introducing Berger's test and CBM, they lead to hypotheses acceptance regions with properties that are identical in principle. Namely, in contrast to the classical cases, in which the observation space is divided into two complementary sub-spaces for acceptance and rejection of the tested hypotheses, here the observation space contains regions for making a decision and regions for making no decision (see, for example, Berger [21]; Kachiashvili et al. [35]; Kachiashvili et al. [31]; Kachiashvili et al. [33]; Kachiashvili [28,35]). For CBM, however, the situation is more differentiated than for Berger's test.
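The constrained statement described above can be written schematically as follows. This is only a sketch: the symbols (a decision rule \delta, the risk components r_rej and r_acc, and the restriction level r_0) are notation introduced here for illustration and are not taken from the cited works, and the roles of the two components may be exchanged.

```latex
\[
  r(\delta) = r_{\mathrm{rej}}(\delta) + r_{\mathrm{acc}}(\delta),
  \qquad
  \min_{\delta}\; r_{\mathrm{acc}}(\delta)
  \quad \text{subject to} \quad
  r_{\mathrm{rej}}(\delta) \le r_{0}.
\]
```

Here r_rej denotes the risk of incorrectly rejecting hypotheses and r_acc the risk of incorrectly accepting them. The classical Bayesian rule would instead minimize the sum r(\delta) without any restriction.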
For CBM, the regions for making no decision are divided into regions where it is impossible to make a decision and regions where it is impossible to make a unique decision. In the first case, the impossibility of making a decision means that no decision can be made with the given probability of error for the given observation result; a decision becomes possible when the probability of the error decreases. In the second case, it is impossible to make a unique decision when the probability of the error is required to be small and this level is unattainable for the given observation result. By increasing the error probability, it becomes possible to make a decision.
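A minimal sketch can show how acceptance regions may leave gaps or overlap. It assumes two simple hypotheses about the mean of a normal observation and a hypothetical rule that accepts a hypothesis when its prior-weighted likelihood, multiplied by a factor lam, exceeds that of the competitor. The rule, the function names and all parameter values are assumptions made for this illustration, not the formulas of the cited works.

```python
import math


def normal_pdf(x, mean, sd=1.0):
    """Density of the normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def classify(x, lam, prior1=0.5, mean1=0.0, mean2=2.0):
    """Classify one observation x under an illustrative (assumed) rule.

    H1: mean = mean1, H2: mean = mean2, both with unit variance.
    A hypothesis is accepted when lam times its prior-weighted
    likelihood exceeds the prior-weighted likelihood of the other one.
    """
    w1 = prior1 * normal_pdf(x, mean1)
    w2 = (1.0 - prior1) * normal_pdf(x, mean2)
    accept1 = lam * w1 > w2
    accept2 = lam * w2 > w1
    if accept1 and accept2:
        return "no unique decision"  # acceptance regions overlap
    if accept1:
        return "accept H1"
    if accept2:
        return "accept H2"
    return "no decision"             # x lies in no acceptance region


if __name__ == "__main__":
    for lam in (0.5, 1.0, 2.0):
        print(lam, [classify(x, lam) for x in (-1.0, 1.0, 3.0)])
```

In this sketch lam = 1 gives two complementary regions (apart from the tie point), lam below 1 opens a gap between them, and lam above 1 makes them overlap, which mirrors the two kinds of no-decision regions described above.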
In our opinion, these properties of Berger's test and CBM are very interesting and useful. They bring the statistical hypotheses testing rule much closer to the everyday decision-making rule, in which, when the necessary information is lacking, accepting one of the stated suppositions is not compulsory.
The specific features of the hypotheses testing regions of Berger's test and CBM, namely the existence of the no-decision region in Berger's test and the existence of regions where it is impossible to make a unique decision or any decision in CBM, make it possible to develop sequential tests on their basis [2,36,26,28]. The sequential test was introduced by Wald in the mid-1940s [37,38]. Since Wald's pioneering works, many investigations have been dedicated to the problems of sequential analysis (see, for example, Berger et al. [39]; Ghosh [40]; Ghosh et al. [41]; Siegmund [42]), and efforts to develop this approach constantly increase, as it has many important advantages in comparison with the parallel methods [43].
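As an illustration of a sequential test with a region in which no decision is made and sampling continues, the following is a minimal sketch of Wald's classical sequential probability ratio test with his approximate thresholds. It is not the CBM-based or Berger-type sequential test; the normal model, the hypothesized means and the error levels are assumptions chosen for the example.

```python
import math


def sprt(observations, mean0=0.0, mean1=1.0, sd=1.0, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: mean = mean0 against H1: mean = mean1.

    Observations are assumed to be independent N(mean, sd^2).
    Returns the decision and the number of observations used.
    """
    # Wald's approximate thresholds on the log-likelihood ratio.
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))
    llr = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        # Log-likelihood ratio contribution of a single observation.
        llr += (mean1 - mean0) * (x - 0.5 * (mean0 + mean1)) / (sd * sd)
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    # Data exhausted while still between the thresholds.
    return "continue sampling", n


if __name__ == "__main__":
    print(sprt([1.5] * 10))
    print(sprt([-0.5] * 10))
    print(sprt([0.5, 0.5]))
```

The interval between the two thresholds plays the role of a no-decision region: while the accumulated log-likelihood ratio stays inside it, neither hypothesis is accepted and another observation is requested.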
The application of CBM to different types of hypotheses (two or many simple, composite, directional and multiple hypotheses) in parallel and sequential experiments showed the advantage and uniqueness of the method in comparison with existing ones [24-29,44]. The advantage of the method is the optimality of the decisions made, with guaranteed reliability, and the minimal number of observations necessary for the given reliability. CBM uses not only loss functions and a priori probabilities for making decisions, as the classical Bayesian rule does, but also a significance level, as the frequentist method does. The combination of these capabilities improves the quality of the decisions made in CBM in comparison with other methods. This has been confirmed many times by applying CBM to the solution of different practical problems [45-47,32,44].
Finally, it must be noted that the detailed investigation of different statements of CBM and the choice of optimal loss functions in the constrained statements of the Bayesian testing problem open wide opportunities in statistical hypotheses testing, with new, previously unknown and interesting properties. On the other hand, stating the Bayesian estimation problem as a constrained optimization problem gives new opportunities for finding optimal estimates with new, previously unknown properties, and it seems that these properties will compare favourably with those of the approaches known today.
In our opinion, the proposed CBM open a path for future, promising investigations that will give researchers the opportunity to obtain new results in the theory and practice of statistical inference. This completely corresponds to the thoughts of the well-known statistician B Efron [48]: “Broadly speaking, nineteenth century statistics was Bayesian, while the twentieth century was frequentist, at least from the point of view of most scientific practitioners. Here in the twenty-first century scientists are bringing statisticians much bigger problems to solve, often comprising millions of data points and thousands of parameters. Which statistical philosophy will dominate practice? My guess, backed up with some recent examples, is that a combination of Bayesian and frequentist ideas will be needed to deal with our increasingly intense scientific environment. This will be a challenging period for statisticians, both applied and theoretical, but it also opens the opportunity for a new golden age, rivaling that of Fisher, Neyman, and the other giants of the early 1900s.”
References
- Berger JO (1985) Statistical Decision Theory and Bayesian Analysis, Springer, New York.
- Berger JO, Brown LD, Wolpert RL (1994) A Unified Conditional Frequentist and Bayesian Test for Fixed and Sequential Simple Hypothesis Testing. The Annals of Statistics 22(4): 1787-1807.
- Bernardo JM, Rueda R (2002) Bayesian Hypothesis Testing: A Reference Approach. International Statistical Review 1-22.
- Christensen R (2005) Testing Fisher, Neyman, Pearson, and Bayes. American Statistician 59(2): 121-126.
- Hubbard R, Bayarri MJ (2003) Confusion over Measures of Evidence (p’s) Versus Errors (α’s) in Classical Statistical Testing. The American Statistician 57: 171-177.
- Lehmann EL (1993) The Fisher, Neyman-Pearson Theories of Testing Hypotheses: One Theory or Two? American Statistical Association Journal, Theory and Methods 88(424): 1242-1249.
- Lehmann EL (1997) Testing Statistical Hypotheses (2nd edn). New York: Springer, USA.
- Moreno E, Giron FJ (2006) On the Frequentist and Bayesian Approaches to Hypothesis Testing, SORT 30(1): 3-28.
- Wolpert RL (1996) Testing simple hypotheses. In: Bock HH, et al. (Eds), Data Analysis and Information Systems (7th Edn), Heidelberg: Springer, pp. 289-297.
- Fisher RA (1925) Statistical Methods for Research Workers, London: Oliver and Boyd, UK.
- Neyman J, Pearson E (1928) On the Use and Interpretation of Certain Test Criteria for Purposes of Statistical Inference. Part I, Biometrika, 20: 175-240.
- Neyman J, Pearson E (1933) On the Problem of the Most Efficient Tests of Statistical Hypotheses, Philos. Trans Roy Soc Ser A 231: 289-337.
- Jeffreys H (1939) Theory of Probability, (1st edn). Oxford: The Clarendon Press, UK.
- Berger JO, Wolpert RL (1988) The Likelihood Principle, (2nd edn). (with discussion). IMS, Hayward: CA, USA.
- Bernardo JM (1980) A Bayesian analysis of classical hypothesis testing. Universidad de Valencia, 1: 605-617.
- Delampady M, Berger JO (1990) Lower bounds on Bayes factors for the multinomial distribution, with application to chi-squared tests of fit. Ann Statist 18: 1295-1316.
- Kiefer J (1977) Conditional confidence statement and confidence estimations (with discussion). J Amer. Statist Assoc 72(360): 789-808.
- Bansal NK, Sheng R (2010) Bayesian Decision Theoretic Approach to Hypothesis Problems with Skewed Alternatives. Journal of Statistical Planning and Inference 140: 2894-2903.
- Bansal NK, Miescke KJ (2013) A Bayesian decision theoretic approach to directional multiple hypotheses problems. Journal of Multivariate Analysis 120: 205-215.
- Bansal NK, Hamedani GG, Maadooliat M (2016) Testing Multiple Hypotheses with Skewed Alternatives. Biometrics 72(2): 494-502.
- Berger JO (2003) Could Fisher, Jeffreys and Neyman have Agreed on Testing? Statistical Science 18: 1-32.
- Kachiashvili KJ (2003) Generalization of Bayesian Rule of Many Simple Hypotheses Testing. International Journal of Information Technology & Decision Making 2(1): 41-70.
- Kachiashvili KJ (2011) Investigation and Computation of Unconditional and Conditional Bayesian Problems of Hypothesis Testing. ARPN Journal of Systems and Software 1(2): 47-59.
- Kachiashvili KJ (2014) Comparison of Some Methods of Testing Statistical Hypotheses. Part I. Parallel Methods. International Journal of Statistics in Medical Research 3: 174-189.
- Kachiashvili KJ (2014) Comparison of Some Methods of Testing Statistical Hypotheses. Part II. Sequential Methods. International Journal of Statistics in Medical Research 3: 189-197.
- Kachiashvili KJ (2015) Constrained Bayesian Method for Testing Multiple Hypotheses in Sequential Experiments. Sequential Analysis: Design Methods and Applications 34(2): 171-186.
- Kachiashvili KJ (2016) Constrained Bayesian Method of Composite Hypotheses Testing: Singularities and Capabilities. International Journal of Statistics in Medical Research, 5(3): 135-167.
- Kachiashvili KJ (2018a) Constrained Bayesian Methods of Hypotheses Testing: A New Philosophy of Hypotheses Testing in Parallel and Sequential Experiments. Nova Science Publishers, Inc., New York, pp. 456.
- Kachiashvili KJ (2018) On One Aspect of Constrained Bayesian Method for Testing Directional Hypotheses. Biomed J Sci & Tech Res, 2(5). BJSTR.MS.ID.000821. DOI: 10.26717/BJSTR.2018.02.000821
- Kachiashvili GK, Kachiashvili KJ, Mueed A (2012) Specific Features of Regions of Acceptance of Hypotheses in Conditional Bayesian Problems of Statistical Hypotheses Testing. Sankhya: A 74(1): 112-125.
- Kachiashvili KJ, Hashmi MA, Mueed A (2012) Sensitivity Analysis of Classical and Conditional Bayesian Problems of Many Hypotheses Testing. Communications in Statistics-Theory and Methods 41(4): 591-605.
- Kachiashvili KJ, Hashmi MA, Mueed A (2012) The Statistical Risk Analysis as the Basis of the Sustainable Development. Int J of Innovation and Technol 9(3): 1250024.
- Kachiashvili KJ, Mueed A (2013) Conditional Bayesian Task of Testing Many Hypotheses. Statistics 47(2): 274-293.
- Kachiashvili KJ, Prangishvili AI (2018) Verification in biometric systems: problems and modern methods of their solution. Journal of Applied Statistics 45(1): 43-62.
- Kachiashvili KJ, Hashmi MA, Mueed A (2008) The statistical risk analysis as the basis of the sustainable development. Proceedings of the 4th IEEE International Conference on Management of Innovation & Technology (ICMIT2008), Bangkok, Thailand, 1: 1210-1215.
- Kachiashvili KJ, Hashmi MA (2010) About Using Sequential Analysis Approach for Testing Many Hypotheses. Bulletin of the Georgian Academy of Sciences, 4(2): 20-25.
- Wald A (1947) Sequential Analysis. New York: Wiley, USA.
- Wald A (1947) Foundations of a General Theory of Sequential Decision Functions. Econometrica 15: 279-313.
- Berger JO, Wolpert RL (1984) The Likelihood Principle. Institute of Mathematical Statistics Monograph Series (IMS), Hayward: CA, USA.
- Ghosh BK (1970) Sequential Tests of Statistical Hypotheses. Addison-Wesley, Reading, Massachusetts.
- Ghosh BK, Sen PK (1991) Handbook of Sequential Analysis. Dekker, NY, USA.
- Siegmund D (1985) Sequential Analysis. Springer Series in Statistics. Springer-Verlag, NY, USA.
- Tartakovsky A, Nikiforov I, Basseville M (2015) Sequential Analysis: Hypothesis Testing and Changepoint Detection. Taylor & Francis Group, New York, USA.
- Kachiashvili KJ, Bansal NK, Prangishvili IA (2018) Constrained Bayesian Method for Testing the Directional Hypotheses. Journal of Mathematics and System Science 8: 96-118.
- Kachiashvili KJ, Melikdzhanian DI (2006) Identification of River Water Excessive Pollution Sources. International Journal of Information Technology & Decision Making. World Scientific Publishing Company, 5(2): 397-417.
- Kachiashvili KJ, Gordeziani DG, Lazarov RG, Melikdzhanian DI (2007) Modeling and simulation of pollutants transport in rivers. AMM 31: 1371-1396.
- Kachiashvili KJ, Hashmi MA, Mueed A (2009) Bayesian Methods of Statistical Hypothesis Testing for Solving Different Problems of Human Activity. AMIM 14(2): 3-17.
- Efron B (2004) Large-Scale Simultaneous Hypothesis Testing. Journal of the American Statistical Association 99(465): 96-104.