Elsayir HA

doi:10.19080/BBOAJ.2024.11.555824

Research Article

Estimating the Sample Size for Epidemic and Medical Research

Elsayir HA¹ and Ibrahim Hassan Alkhairy²*

¹Department of Mathematics, Al-Qunfudah University College, Umm Al-Qura University and Omdurman Islamic University (Sudan), Mecca, Saudi Arabia

²Department of Mathematics, Al-Qunfudah University College, Umm Al-Qura University, Mecca, Saudi Arabia

Submission: May 05, 2024; Published: July 05, 2024

*Corresponding author: Ibrahim Hassan Alkhairy, Department of Mathematics, Al-Qunfudah University College, Umm Al-Qura University, Mecca, Saudi Arabia, Email: ihkhairy@uqu.edu.sa

How to cite this article: Elsayir HA, Ibrahim Hassan A. Estimating the Sample Size for Epidemic and Medical Research. Biostat Biom Open Access J. 2024; 11(5): 555824. DOI: 10.19080/BBOAJ.2024.11.555824

Abstract

This study aims to develop a method for estimating an optimal sample size for health and medical research. The paper is designed as a tool that researchers can use in planning and conducting good-quality research, and it discusses various aspects of sample size consideration in medical research. It also covers the essentials of calculating power and sample size for a variety of applied study designs. Sample size computation for survey-type studies, observational studies, and experimental studies based on means and proportions or rates, as well as sensitivity-specificity tests for assessing categorical outcomes, is presented in detail. Considerable interest has focused on medical research since the beginning of the COVID-19 pandemic, and the resulting literature is scattered over many sources. This paper aims to contribute to this field.

Keywords: Clinical significance; Confidence interval; Research design; Sample size estimation; Statistical power; Type 1 and Type 2 error

Introduction

The sample size is the number of observations or specimens required in a study. A sample that is too large wastes resources and time, whereas a sample that is too small fails to produce conclusive and reliable results (N. Gopi Chander [1]). Therefore, depending on the study design and the outcome, which is specified before the start of the study, a researcher needs to estimate the optimum sample size by a scientific method in order to produce reliable results that can serve as a strong foundation for evidence-based practice. An arbitrary or inadequate calculation of sample size can affect the research design and its significance: an overly large sample raises ethical concerns and wastes time and money, while an overly small sample reduces the power of the study. Very recently, many medical researchers rushed to conduct research on COVID-19 and related diseases, and they had to meet the challenge of the statistical considerations needed for sample size determination, including the level of significance, the desired power, and the estimated effect size (see, for instance, [2-4]). Further literature on the topic can be found in Russel [5], Bellera et al. [6], Manski [7] and Johnston [8].

It has been observed that much published research lacks clinical relevance, lacks clarity about the sample size estimation, and has low power due to an inappropriate sample size; see [9-12]. Sample size estimation is also essential for assessing the feasibility of a study in terms of required cost and time. Paul H. Lee [9] reviewed research trials on COVID-19 that were published between Jan. 1, 2020, and Mar. 25, 2020, and indexed in PubMed, and assessed the quality of their sample size calculations. He identified a total of 374 articles and concluded that, in general, the quality of sample size calculation was not acceptable. Some of these studies did not justify the sample size; others had problems and mistakes concerning Cohen’s d effect size and the statement of the null hypothesis assumption for the control group. For instance, Gautret et al. [10] reported only the effect in the treatment (hydroxychloroquine) group (“Assuming a 50% efficacy of hydroxychloroquine in reducing the viral load at day 7”), while the effect in the control group was missing. In a time of rapid disease outbreak such as COVID-19, researchers work around the clock to examine the effectiveness of potential treatments, and inappropriate sample size calculation will lead to adverse consequences.

The results showed that the method for attaining the desired precision of expected width provides satisfactory results only when sample sizes are large. The articles by Arif Habib et al. [13], Manski [14], and Manski [7] addressed the choice of the smallest effect, sample size with various designs, the effect of validity and reliability of dependent and predictor variables, sample size for comparison of subgroups and for individual differences and responses, and sample size when adjusting for subgroups of unequal size. Some practical guidelines for effective sample size determination are given in Lenth [15]. Hopkins (2018) presented a spreadsheet using two new methods for estimating the sample size for a study designed to make an inference about real-world significance, which requires interpretation of the magnitude of an outcome, based on acceptable uncertainty defined either by the width of the confidence interval or by error rates for a clinical or practical decision arising from the study.

Aiming to develop and present accurate and rigorous approaches for sample size computation using a real-world case study, Karissa M. et al. (2019) presented a formal guidance approach to sample size calculations for retrospective burden of illness studies, designed for practical application in real-world review studies. They also presented sample size formulae for parameters that are of frequent interest in the context of a burden of illness study; see also Johnston KM [8]. The article by Sharma et al. [16] covered different formulas for sample size calculation for different types of variables measured in distinct study designs, namely descriptive, epidemiological, comparative, and interventional research studies. Althubaiti (2023) also introduced a simple practical guide intended to help researchers and health practitioners interested in quantitative research.

The rest of this paper is organized as follows: Section 2 is devoted to sample size approaches, formulas and calculation procedures, and their relation to confidence level and power analysis. Numerical illustrations and examples are given in Section 3, followed by a discussion of the results in Section 4, while Section 5 presents the conclusions reached in this research.

Methodology of Sample Size Estimation

Sample Size Approaches

A popular approach to determining sample size involves studying the power of a hypothesis test. This includes specifying a hypothesis and a significance test on a parameter, specifying the effect size of scientific interest, obtaining historical parameter estimates to be used in computing the power function of the test, and specifying a target power value as an objective for the test; see Russel [5].

Several equations are used to determine the minimum number of subjects that need to be included in a study to have sufficient statistical power to detect an effect in medical research. Statistical power is determined by various variables, such as the variance and the treatment effect size; see Morris (2012). For instance, the clinically important difference is determined by clinical experience. The smaller the clinically important difference, the more difficult it is to demonstrate statistically, and the larger the sample size required.

In each test of hypothesis, two errors can be committed. A Type I error occurs when we incorrectly reject H₀ when it is in fact true; a Type II error occurs when we fail to reject H₀ when it is false. The probabilities of these errors are denoted α and β, respectively. In hypothesis testing, we usually focus on power, which is the probability that a test correctly rejects a false null hypothesis. A good test is one with a low probability of committing a Type I error (i.e., small α) and high power (i.e., small β). Suppose we want to test the following hypotheses at α = 0.05: H₀: μ = 90 versus H₁: μ ≠ 90. Suppose a sample of size n = 100 is selected, and the standard deviation of the outcome is σ = 20. A test statistic is then computed and compared to an appropriate critical value. If the null hypothesis is true (μ = 90), then we are likely to get a sample whose mean is close to 90, although it is possible to observe any of the sample means shown in (Figure 1) under H₀.

To construct the decision rule for our test of hypothesis, we choose critical values based on α = 0.05 and a two-tailed test. (Figure 2) shows the rejection region for the test of H₀: μ = 90 versus H₁: μ ≠ 90 at α = 0.05. With a standard error of 20/√100 = 2, H₀ is rejected if the sample mean falls below 86.08 or above 93.92. The areas in the two tails of the curve in (Figure 2) represent the probability of a Type I error, α = 0.05.

Now suppose that the alternative hypothesis, H₁, is true (i.e., μ ≠ 90) and that the true mean is 94. (Figure 3) shows the distributions of the sample mean under H₀ and under H₁, with the sample mean values along the horizontal axis.

If the true mean is 94, then H₁ is true. For the test, α is set at 0.05 and H₀ is rejected if the observed sample mean exceeds 93.92 (see the upper tail of the rejection region). The critical value (93.92) is indicated by the vertical line. The probability of a Type II error is denoted β, where β = P(Do not reject H₀ | H₀ is false); β is shown in (Figure 3) as the area under the rightmost curve (H₁) to the left of the vertical line (where we do not reject H₀). Power, 1 − β, is shown in the figure as the area under the rightmost curve (H₁) to the right of the vertical line (where we reject H₀). Note that β and power are related to α, the variability of the outcome, and the effect size. From the figure we can see what happens to β and power if we increase α: the critical value shifts to the left, so β decreases and power increases. (Figure 4) shows the same structure for the case where the mean under the alternative hypothesis is 98, that is, the distribution of x̄ under H₀: μ = 90 and under H₁: μ = 98.

Power is much greater when there is a larger difference between the mean under H₀ and the mean under H₁ (i.e., 90 versus 98). A statistical test is much more likely to reject H₀ in favor of H₁ if the true mean is 98 than if the true mean is 94. In this situation, there is little overlap between the distributions under H₀ and H₁.
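The comparison between the two alternatives can be checked numerically. The following is a minimal sketch, assuming a two-sided one-sample z-test with known σ (the function name `two_sided_z_power` is illustrative and not part of the original study); it uses only the values stated in the example (μ₀ = 90, σ = 20, n = 100, α = 0.05).

```python
from statistics import NormalDist


def two_sided_z_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test when the true mean is mu1."""
    std_normal = NormalDist()
    se = sigma / n ** 0.5                       # standard error of the sample mean
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    lower = mu0 - z_crit * se                   # reject H0 below this value (about 86.08)
    upper = mu0 + z_crit * se                   # reject H0 above this value (about 93.92)
    sampling = NormalDist(mu1, se)              # distribution of the sample mean under H1
    # Power = probability that the sample mean lands in either rejection tail.
    return sampling.cdf(lower) + (1 - sampling.cdf(upper))


power_94 = two_sided_z_power(90, 94, 20, 100)
power_98 = two_sided_z_power(90, 98, 20, 100)
print(round(power_94, 3), round(power_98, 3))   # approximately 0.516 and 0.979
```

Under these assumptions the power is only about one half when the true mean is 94, but close to 0.98 when it is 98, which matches the qualitative reading of the two figures.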

The Formulas for Sample Size Calculations

The formulas for calculating the required sample size depend on the nature of the population data, that is, whether the data collected are categorical or quantitative. They require knowledge of the variance or proportion and a specification of the maximum desirable error, as well as the acceptable Type I error risk (e.g., the confidence level). It is possible to construct a table that suggests the optimal sample size given a population size, a specific margin of error, and a desired confidence level; see Harza A [17].

Sample Size Calculations for Categorical Outcomes (e.g., Treatment Patterns):

When considering treatment distributions in a population under a binomial (n, p) assumption, in which n stands for the sample size and p represents the probability of receiving the treatment, the following formulas hold (Karissa M. et al. (2019), Arif Habib et al. [13], and Sharma et al. [16]):

For cost estimation, based on Eqs. (2) and (3), if an estimate of the standard deviation is not available, an estimate of the coefficient of variation (cv) can be used instead. In practice, a range of possible values can be considered, based on cost and any a priori knowledge regarding the population for health resource utilization, expected range of disease severity, and care vs. high-cost acute treatment such as inpatient stays.
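To illustrate how a range of cv values might be explored, the sketch below assumes the textbook relation for estimating a mean to within a relative margin ε, namely n = (z · cv / ε)², which follows from n = (zσ/d)² with d = εμ and cv = σ/μ. This is an assumption made for illustration and is not necessarily the exact form of Eqs. (2) and (3); the function name and the cv values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_for_mean_from_cv(cv, rel_margin, confidence=0.95):
    """Sample size to estimate a mean within +/- rel_margin (as a fraction of the mean).

    Assumes n = (z * cv / rel_margin)^2, i.e. a normal-approximation interval.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * cv / rel_margin) ** 2)


# Hypothetical range of cv values for a cost outcome, 10% relative margin.
for cv in (0.5, 1.0, 2.0):
    print(cv, n_for_mean_from_cv(cv, rel_margin=0.10))
```

Because the required n grows with the square of cv, highly skewed cost data can demand a far larger sample than a well-behaved clinical measurement at the same relative precision.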

Estimation of Sample Size for Cross-sectional or Descriptive Research Studies

These studies or surveys are generally conducted to find out, observe, describe, and document aspects of a situation as it naturally occurs. They are not used to identify the cause of something, such as the reason for an epidemic. A researcher might collect cross-sectional data on past alcohol habits and current diagnoses of liver disease, for example.

Following Richard, A [18], Bellera et al. [6], and Sharma SK [16], the sample size calculations are given below:

Sample size when the data are on a nominal/ordinal scale and a proportion is one of the parameters:

Sample Size Estimation for Case-control Studies

A case-control study examines whether an exposure is associated with an outcome (i.e., a disease or condition of interest). It is a type of observational study in which a commonly assumed cause is studied in two groups that differ in outcome. An example is a case-control study of the relationship between alcohol and liver disease.

Example I: Sample size when a proportion is the parameter of the study or the data are on a nominal/ordinal scale:

Sample Size Estimation for Experimental Studies

Experimental studies, or randomized controlled trials, are studies in which the researcher deliberately manipulates the variables under study. Randomization and a control group are important aspects of these studies. The investigator provides an intervention, studies its effect, and compares the experimental and control groups.

Example I: Sample size to rule out the difference (effect size) among two groups (based on difference in proportion or for dichotomous nominal/ordinal variables)

Sample size to rule out the difference (effect size) among two groups (based on difference in the mean or for continuous variables).
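The two cases above can be sketched in code. The functions below assume the widely used normal-approximation formulas for two equal-sized groups, n = (z₁₋α/₂ + z₁₋β)² [p₁(1 − p₁) + p₂(1 − p₂)] / (p₁ − p₂)² for proportions and n = 2σ²(z₁₋α/₂ + z₁₋β)² / Δ² for means. These are standard textbook forms and may differ in detail from the formulas used in the cited sources; the function names and the input values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def _z(prob):
    """Standard normal quantile."""
    return NormalDist().inv_cdf(prob)


def n_per_group_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size to detect a difference between two proportions."""
    z_sum = _z(1 - alpha / 2) + _z(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(z_sum ** 2 * variance / (p1 - p2) ** 2)


def n_per_group_means(sigma, delta, alpha=0.05, power=0.80):
    """Per-group sample size to detect a mean difference delta with common SD sigma."""
    z_sum = _z(1 - alpha / 2) + _z(power)
    return ceil(2 * (z_sum * sigma / delta) ** 2)


# Hypothetical inputs: cure rates of 60% vs 40%; SD of 10 with a difference of 5.
n_prop = n_per_group_proportions(0.60, 0.40)
n_mean = n_per_group_means(sigma=10, delta=5)
print(n_prop, n_mean)
```

In practice the result is often inflated by an allowance for dropout, and unequal allocation between groups requires a modified formula.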

Estimates of the true prevalence of COVID-19 can be made by random sampling in the wider population. Ola Brynildsrud [19] used simulations to explore confidence intervals of prevalence estimates under different sampling intensities and degrees of sample pooling, simulating the effect of sample pooling on prevalence estimates under different settings for the true prevalence. The simulation starts by generating a population of N individuals and letting each individual have probability p of being infected at sampling time. If n is the number of patient samples collected from the population, and the number of patient samples pooled into a single well is denoted by k, then the total number of pools is n/k.

Pooled sampling can be used to efficiently assert freedom from disease with a certain probability. If the population is free from the disease, then we find no true positive specimen in our sampling. Ola Brynildsrud [19] calculated how many samples must be taken from a population with prevalence p to ensure that the probability of sampling at least one positive patient is α or higher. The required number, using the formula of Christensen and Gardner [20], is:

where n is the number of patient samples from the population, θ is the specified test value, and η is the sensitivity (for example, 0.95).

For tests with perfect specificity, we do not have to worry about false positives, and if any pools come out as positive, we classify the population as not free from disease. The formula of Christensen and Gardner [20] can be expanded to the case with pooled sampling:
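The reasoning behind such a calculation can be sketched as follows. This is a simplified derivation under stated assumptions, not necessarily the exact formula of Christensen and Gardner [20] or of Brynildsrud [19]: infections are independent, specificity is perfect, and pooling does not reduce sensitivity. If each sample tests positive with probability pη, then requiring 1 − (1 − pη)ⁿ ≥ α gives n ≥ log(1 − α) / log(1 − pη); for pools of size k, the probability that a pool tests positive is η(1 − (1 − p)ᵏ). The function names and input values are hypothetical, and `confidence` plays the role of α in the text.

```python
from math import ceil, log


def samples_for_detection(prevalence, sensitivity, confidence):
    """Individual samples needed so that P(at least one positive test) >= confidence."""
    p_positive = prevalence * sensitivity
    return ceil(log(1 - confidence) / log(1 - p_positive))


def pools_for_detection(prevalence, sensitivity, confidence, pool_size):
    """Pools of size pool_size needed so that P(at least one positive pool) >= confidence."""
    p_pool_infected = 1 - (1 - prevalence) ** pool_size   # pool holds >= 1 infected sample
    p_pool_positive = sensitivity * p_pool_infected       # assumes no dilution effect
    return ceil(log(1 - confidence) / log(1 - p_pool_positive))


# Hypothetical setting: 1% prevalence, 95% sensitivity, 95% detection probability.
n_single = samples_for_detection(0.01, 0.95, 0.95)
n_pools = pools_for_detection(0.01, 0.95, 0.95, pool_size=10)
print(n_single, n_pools, n_pools * 10)
```

Under these assumptions pooling needs a similar number of patient samples but roughly a tenth as many laboratory tests; a real assay may lose sensitivity when samples are diluted, which this sketch ignores.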

Numerical Analysis:

There are different equations that can be used to calculate sample size and confidence intervals, depending on factors such as whether the standard deviation is known or whether small samples (n < 30) are involved. Most commonly, however, the population refers to a group of people, such as the patients in a hospital, the patients within a certain age group in some geographic area, or the patients in a hospital at a given time. Based on different formulas, we present the recommended sample size and its relationship to related factors.

The Sample Size Calculator uses the following formulas:

1. $n = \dfrac{z^2\,p(1-p)}{e^2}$

2. n (with finite population correction) = $\dfrac{z^2\,p(1-p)/e^2}{1 + z^2\,p(1-p)/(e^2 N)}$

Where:

n is the sample size; N is the population size.

z is the z-score associated with a level of confidence, p is the sample proportion, expressed as a decimal, and e is the margin of error, expressed as a decimal.

Numerical Example:

Suppose we want to estimate the proportion of COVID-19 patients discharged from a given hospital who are satisfied with the level of care they received while hospitalized, at a 90% confidence level and with a margin of error of 7%. What sample size would we require?

The sample size (n) can be computed using the following formula:
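As a minimal sketch of this example, the code below assumes the standard formula for a proportion, n = z²p(1 − p)/e², and, because the expected proportion is not stated, the conservative value p = 0.5 (an assumption that maximizes the required n). The function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_for_proportion(margin, confidence, p=0.5):
    """Sample size to estimate a proportion within +/- margin at the given confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.645 for 90% confidence
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)


n_required = n_for_proportion(margin=0.07, confidence=0.90)
print(n_required)   # 139
```

If prior information suggested a proportion far from 0.5, passing it as `p` would reduce the required sample size.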

Margin of Error vs. Sample Size:

Margin of error indicates the extent to which the results from the sample reflect the overall population. The lower the margin of error, the closer the researcher is to an accurate estimate.

Example:

Given the confidence level (1 − α), the margin of error (E), the population proportion (p), and, optionally, the population size (N), the sample size (n) is calculated according to the formula:

where z = 1.96 for a confidence level of 95%, p = proportion (expressed as a decimal), N = population size, and E = margin of error.

z = 1.96, p = 0.4, N = 1000, E = 0.05

The sample size (with finite population correction) is equal to 269.
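This result can be reproduced approximately in code. The sketch below assumes the uncorrected size n₀ = z²p(1 − p)/E² and the finite population correction n = n₀ / (1 + n₀/N); a common variant, Cochran's form, uses n₀ / (1 + (n₀ − 1)/N) instead. Which variant and which rounding rule produced the reported value is an assumption here, and the function names are illustrative.

```python
def n_infinite(z, p, margin):
    """Uncorrected sample size for a proportion (very large population)."""
    return z ** 2 * p * (1 - p) / margin ** 2


def n_finite(z, p, margin, population):
    """Sample size with finite population correction, n0 / (1 + n0 / N)."""
    n0 = n_infinite(z, p, margin)
    return n0 / (1 + n0 / population)


def n_finite_cochran(z, p, margin, population):
    """Cochran's variant of the correction, n0 / (1 + (n0 - 1) / N)."""
    n0 = n_infinite(z, p, margin)
    return n0 / (1 + (n0 - 1) / population)


n0 = n_infinite(1.96, 0.4, 0.05)
n_corr = n_finite(1.96, 0.4, 0.05, 1000)
n_coch = n_finite_cochran(1.96, 0.4, 0.05, 1000)
print(round(n0, 2), round(n_corr, 2), round(n_coch, 2))   # 368.79 269.43 269.62
```

Both variants land between 269 and 270 for these inputs, so the reported 269 corresponds to rounding to the nearest integer; rounding up, as is often recommended for sample sizes, would give 270.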

(Table 1) presents the results of some calculations of sample size for a given confidence level and margin of error, which may be used to determine the appropriate sample size for almost any study. Note that the sample size is larger for a lower margin of error or a higher level of confidence. The margin of error depends on the size and variability of the sample. Naturally, the error will be smaller if the sample size (n) is large or the variability of the data (standard deviation) is small. The sample sizes in the following tables presume that the attributes being measured are distributed normally or nearly so. If this assumption is violated, then the whole population must be surveyed.

The level of confidence of a sample is expressed as a percentage and describes the extent to which one can be sure it is representative of the target population. A 95% confidence interval does not mean that 95% of the sample data lie within that interval. A confidence interval is not a range of plausible values for the sample; rather, it is an interval estimate of plausible values for the population parameter. The proper understanding of a CI is not that simple. The true value of the population parameter is fixed, while the location and width of a 95% CI based on a random sample vary randomly. If repeated random samples of equal size are selected from the population, we obtain a corresponding number of 95% CIs, and about 95% of them can be expected to contain the population parameter value. (Table 2) shows the required sample size for detecting differences from 0.10% to 50%, with 90% confidence and 80% power and conversion rates around 5% (with a 10% increase for compensation).

In medical research, the researcher should generally find the optimal sample size by plotting power as a function of effect size and sample size, to avoid wasting resources. (Figure 5) exhibits the total sample size and the power of the test for the difference between two dependent means with effect size d = 0.3. The effect size represents the smallest difference that would be of significance. It could be a difference in cure rates, a standardized mean difference, or a correlation coefficient. As the effect size increases, the Type II error decreases. For a specific power, ‘small effects’ require a greater sample size than ‘large effects’. (Figure 5) shows that an increase in sample size yields greater power. The sample size has an indirect effect on power because it affects the measure of variance used to calculate the test statistic (t-test). Since the power calculation involves a comparison of sample means, one is more interested in the standard error (the variability of the sample mean across samples) than in the standard deviation or variance. When the sample size n is large, the standard error is lower than when n is small; consequently, a large n also yields a smaller β region than a small n. The relationship between α and β is shown in (Figure 6) using the t distribution. The figure shows the change in β and power if α is increased. In general, when the α level, the effect size, or the sample size increases, the power increases.
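The shape of such a power curve can be approximated without specialized software. The sketch below is an illustration under a simplifying assumption: it replaces the noncentral t distribution of the paired t-test with a normal approximation, power ≈ Φ(d√n − z₁₋α/₂) + Φ(−d√n − z₁₋α/₂), where n is the number of pairs. Exact t-based calculations give slightly larger sample sizes. The function names are illustrative.

```python
from statistics import NormalDist


def approx_power_paired(d, n_pairs, alpha=0.05):
    """Normal-approximation power of a two-sided test of two dependent means."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = d * n_pairs ** 0.5          # standardized effect scaled by sqrt(n)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def smallest_n(d, target_power, alpha=0.05):
    """Smallest number of pairs whose approximate power reaches target_power."""
    n = 2
    while approx_power_paired(d, n, alpha) < target_power:
        n += 1
    return n


# Power as a function of sample size for effect size d = 0.3.
for n in (20, 50, 100, 150, 200):
    print(n, round(approx_power_paired(0.3, n), 3))

n_80 = smallest_n(0.3, 0.80)
print(n_80)
```

Repeating the loop for several values of d shows the pattern described above: for a fixed target power, smaller effects require many more observations.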

(Figure 7) shows the two-tailed t-test correlation plot of the α error probability (point biserial model), where the total sample size values are 10, 20 and 30 and the power is 0.9997392. Considering the alternative hypothesis (H₁), one chooses a rejection region such that the probability of observing a sample value in that region is less than or equal to α when H₀ is true. If the obtained sample statistic falls within the rejection region, the decision is made to reject H₀. If α is set at 5%, this means that in 5% of cases, or one in twenty, the data indicate that “something” exists when in fact it does not.

As observed in (Figure 8), the t-test for correlation (point biserial model) was used for post-hoc power analysis, given α, the sample size, and the effect size: with effect size ρ = 0.3, α = 0.05, and total sample size n = 300, the power is 0.9997392.

(Figure 9) illustrates that as the effect size (ρ) increases, the error probability (α) decreases.

When planning a clinical study, the resulting sample size might be too large while the resources available to conduct such a large study are limited, or ethical reasons may prevent enrolling this many subjects. Reducing the sample size usually involves some compromise, such as accepting a small loss in power.

Evaluating a statistical power of existing medical study:

Consider two study groups, each receiving a different treatment. The primary endpoint may be continuous (means) or binomial, with only two possible outcomes, e.g., mortality (dead/not dead) or pregnancy (pregnant/not). Consider the following inputs; the results after calculation are shown below:

Statistical Parameters

(Figure 10)

p1, p2 = proportion (incidence) of groups #1 and #2

Δ = |p2 − p1| = absolute difference between two proportions

n1 = sample size for group #1, n2 = sample size for group #2

α = probability of type I error (usually 0.05)

z = critical Z value for a given α or β

K = sample size ratio for group #2 to group #1

Φ() = function converting a critical Z value to power.
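With these parameters, a post-hoc power calculation for a binomial endpoint might be sketched as follows. The formula is the usual normal approximation, power = Φ(z − z₁₋α/₂) + Φ(−z − z₁₋α/₂) with z = Δ / √(p1(1 − p1)/n1 + p2(1 − p2)/n2); it is assumed here for illustration, and the input values are hypothetical because the original inputs are not reproduced in the text.

```python
from statistics import NormalDist


def post_hoc_power_two_proportions(p1, p2, n1, n2, alpha=0.05):
    """Approximate post-hoc power for comparing two observed proportions."""
    std_normal = NormalDist()                               # plays the role of Phi
    se = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5   # SE of the difference
    z = abs(p2 - p1) / se                                   # standardized difference
    z_crit = std_normal.inv_cdf(1 - alpha / 2)              # critical Z for alpha
    return std_normal.cdf(z - z_crit) + std_normal.cdf(-z - z_crit)


# Hypothetical study: incidence 40% vs 25%, 100 subjects per group (K = n2 / n1 = 1).
power = post_hoc_power_two_proportions(0.40, 0.25, 100, 100)
print(round(power, 3))
```

For these hypothetical inputs the power comes out well below the conventional 0.80 target, even though the observed difference of 15 percentage points is sizeable.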

The post-hoc power analysis procedure has been criticized as a means of interpreting negative study results. Because post-hoc analyses are typically calculated only on negative trials (p ≥ 0.05), the analysis will yield a low post-hoc power, which may be misinterpreted as the trial having inadequate power. A 95% CI may therefore be a more appropriate tool than post-hoc power for interpreting such results.

Finally, the calculation of an adequate sample size for health surveys and studies involves statistical procedures as well as clinical or practical considerations, and it requires teamwork (including biostatisticians) to determine the sample size that will address the research question of interest with adequate precision or power to obtain results that are clinically significant and meaningful [21-26].

Discussion

The formula for sample size calculation varies with the type of study design. It should be noted that all the sample estimates presented represent the largest possible sample size values for the desired level of confidence. Factors that affect the width of a confidence interval include the size of the sample, the confidence level, and the variability within the sample. As the sample size increases, confidence in the estimate increases and greater precision is attainable. A larger sample size allows the investigator to attach greater significance to the findings, since confidence in the results tends to increase with sample size. It is worthwhile to recall that the confidence interval concept was introduced to address a problem in statistical inference: drawing conclusions from data that represent a randomly selected part of a population. A 95% confidence interval is often used in the biological sciences. A much higher level is usually used in the physical sciences, such as engineering, to provide a higher level of precision and eliminate the risk of manufacturing poor-quality products. The same can hold for sensitive medical research. Trivial errors in the formula, pooling, statistical baseline values, study design, or outcome measures can lead to erroneous estimates, with a great influence on the external validity of the study. In medical research, it is sometimes essential to consider all such issues, and about 10% additional samples can be added to the computed sample size to allow for various contingencies. When determining the optimal sample size in medical research, it is important to consider any subgroups of interest, and either to target such subgroups directly in the sampling strategy or to account for the expected sample size needs if only the whole population is to be investigated.

Conclusion

This study introduced an approach to determining an optimal sample size for use in epidemic and medical research. It presents a tool that researchers can use in planning and conducting good-quality research, discusses various aspects of sample size consideration in medical research, and covers the essentials of calculating power and sample size for a variety of applied study designs. Sample size computation for survey-type studies, observational studies, and experimental studies based on means and proportions or rates, for assessing categorical outcomes, is also presented. Sufficient details, such as the power, significance level, mean or rate for the control group, minimal detectable difference, variance, and dropout rate, should be clearly explained in a study, and any other factors that formed the basis of the sample size calculation should also be included. Considerable interest has focused on medical research since the beginning of the COVID-19 pandemic, and the resulting literature is scattered over many sources; this paper aimed to contribute to this field. To improve the quality of sample size calculation in COVID-19 trials and related research, it is strongly suggested that every research team include a statistician or invite a statistician to evaluate the appropriateness of the sample size calculation. Finally, the method for estimating sample size in an epidemic, health, or medical study should be explained clearly and in sufficient detail to permit its use in other protocols later.

BBOAJ.MS.ID.555824
