Latent Class Analysis of Physical Activity and Mortality in U.S. Adults

Introduction
Physical activity (PA) is recommended for all U.S. individuals for its role in preventing and treating chronic disease [1][2][3][4], as well as for its association with increased longevity [5][6][7] and improved health-related quality of life [8][9]. Current U.S. guidelines recommend that all adults accumulate at least 150 minutes each week of moderate-intensity PA or an equivalent amount of combined moderate- and vigorous-intensity PA [10]. Furthermore, the type of PA selected, independent of duration, has been shown to affect health outcomes in adults [11]. Despite these known relationships between PA and health, PA is commonly understood to be a complex behavior that is generally assessed with varying amounts of measurement error [12]. This is true of both subjective [13,14] and objective methods [15]. Therefore, a need exists for advanced methods that can better characterize complex behaviors such as PA. Latent class analysis (LCA) is a statistical technique used to identify unobservable group membership using a set of observed variables [16,17].

Juniper Online Journal of Public Health
PA behavior can be regarded as an unobservable (latent) behavior, in that it is too complex to measure precisely among free-living populations. Such latent variables can, however, be measured indirectly using a number of related observed variables [18]. LCA is therefore a viable statistical method: it aims to categorize objects into groups such that objects within a group respond similarly to the observed variables, while objects in different groups are as different as possible from one another [19]. More specifically, LCA can use scale items from a PA assessment to create latent groups of similar respondents, with the groups differing from one another in the PA trait. Furthermore, many large national surveys contain questions regarding PA behavior that can be used to form latent classes. Therefore, the purpose of this study was to use LCA with PA indicators from a large national health survey to predict all-cause mortality in U.S. adults.
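To make the idea of grouping respondents by their response patterns concrete, the following is a minimal sketch of how a fitted latent class model assigns a respondent to a class using Bayes' rule. It assumes binary (yes/no) indicators that are independent within a class, which is the usual LCA assumption. The function name and all numeric values are hypothetical illustrations, not estimates from this study.

```python
from math import prod


def posterior_class_probs(response, prevalences, item_probs):
    """Posterior probability of each latent class for one response pattern.

    response    : tuple of 0/1 answers, one per indicator
    prevalences : class prevalences (should sum to 1)
    item_probs  : item_probs[c][j] = P(indicator j == 1 | class c)
    """
    joint = []
    for gamma, rho in zip(prevalences, item_probs):
        # Likelihood of the pattern within a class, assuming the
        # indicators are independent once class membership is known.
        lik = prod(p if y == 1 else 1.0 - p for y, p in zip(response, rho))
        joint.append(gamma * lik)
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical parameters for three classes and four yes/no indicators
# (ordered here as HPA, MPA, VPA, MSPA). NOT the study's estimates.
prevalences = [0.3, 0.2, 0.5]
item_probs = [
    [0.10, 0.10, 0.05, 0.05],  # unlikely to report any activity
    [0.90, 0.90, 0.80, 0.70],  # likely to report all four activities
    [0.80, 0.70, 0.20, 0.10],  # likely to report the first two only
]

post = posterior_class_probs((1, 1, 0, 0), prevalences, item_probs)
assigned = post.index(max(post))  # index of the most probable class
```

A respondent is typically assigned to the class with the highest posterior probability, although the probabilities themselves carry the classification uncertainty.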

Participants and Design
The 2001-02 National Health and Nutrition Examination Survey (NHANES) was used for this research. NHANES is a large national survey representing the civilian, noninstitutionalized U.S. population. NHANES is designed to assess health and nutrition information, with datasets organized by category: demographics, dietary, examination, laboratory, questionnaire, and limited access. The National Center for Health Statistics (NCHS) is responsible for linking mortality data to NHANES participants using a probability matching procedure [20]. The most recent mortality follow-up ended on December 31, 2011. Only participants who were 18+ years of age and eligible for mortality linkage were used in the analysis.

Measures
Four PA variables were used in this study: home/yard (HPA), moderate recreational (MPA), vigorous recreational (VPA), and muscle strengthening (MSPA). Each was determined from a question asking respondents whether they participated in that specific type of activity [20][21][22], and each was dichotomized to represent participation (yes/no). HPA was assessed by the following question: "Over the past 30 days, did you do any tasks in or around your home or yard for at least 10 minutes that required moderate or greater physical effort? By moderate physical effort I mean, tasks that caused light sweating or a slight to moderate increase in your heart rate or breathing. [Such as raking leaves, mowing the lawn or heavy cleaning.]" MPA was assessed by the following question: "Over the past 30 days, did you do moderate activities for at least 10 minutes that cause only light sweating or a slight to moderate increase in breathing or heart rate? Some examples are brisk walking, bicycling for pleasure, golf, and dancing." VPA was assessed by the following question: "Over the past 30 days, did you do any vigorous activities for at least 10 minutes that caused heavy sweating, or large increases in breathing or heart rate? Some examples are running, lap swimming, aerobics classes or fast bicycling." Finally, MSPA was assessed by the following question: "Over the past 30 days, did you do any physical activities specifically designed to strengthen your muscles such as lifting weights, push-ups or sit-ups?" Respondents answering "yes" to a given question were considered to be participating in that type of PA. Finally, four covariates were used for model adjustment: age, sex, race, and income.

Statistical Analysis
PROC LCA was used to determine distinct latent groups of PA behavior among U.S. adults [23,24]. LCA model fit was determined using the log-likelihood chi-square statistic (G²), the Akaike information criterion (AIC), and the Bayesian information criterion (BIC). AIC is a measure of the difference between the data and model likelihood functions. BIC is similar to AIC; however, BIC imposes a larger penalty for increasing the number of model parameters (log(N) times the number of parameters is added for BIC, as opposed to 2 times the number of parameters for AIC). Both AIC and BIC (more so BIC) penalize more complex models, with lower values indicating a relatively better model fit [25][26][27]. Prevalence estimates with their 95% confidence intervals (CIs) were computed for PA types, overall and across demographic variables. PA estimates were also computed across the newly found latent classes, and differences in prevalence were tested using the chi-square statistic. PROC SURVEYPHREG was used to run Cox proportional hazards regression to model the effects of latent PA on mortality while controlling for age, sex, race, and income. SAS version 9.4 was used, and analyses accounted for the sampling design [28][29][30]. The significance level for all tests was set at .05.

Results
Note: N = 5,839. P is the number of parameters. LL is the log-likelihood. AIC is the Akaike information criterion (AIC = G² + 2P). BIC is the Bayesian information criterion (BIC = G² + log(N)P). G² is the LL chi-square fit statistic. df is degrees of freedom. p is the p-value for G². Tests of fit are unavailable for negative df. A 3-class model was the selected LCA model. The intercept-only model LL is -13885.53.
Table 4 shows the conditional probabilities associated with the 3-class LCA model. Each class represented a clearly distinct latent PA subgroup. That is, class I consisted of those not likely to report any forms of PA, class II consisted of those more likely to report all four forms of PA, and class III consisted of those more likely to report HPA and MPA only.
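The AIC and BIC definitions given in the table note above can be sketched in a few lines. This is a minimal illustration, assuming the standard unrestricted LCA parameterization for binary indicators (K - 1 class prevalences plus K x J item-response probabilities). The function names are hypothetical, and the G² value in the usage lines is made up rather than taken from the study's tables.

```python
import math


def lca_num_params(n_classes, n_items):
    """Free parameters in an unrestricted LCA with binary indicators:
    (K - 1) class prevalences plus K * J item-response probabilities."""
    return (n_classes - 1) + n_classes * n_items


def lca_df(n_classes, n_items):
    """Degrees of freedom for the G² test: independent cells in the
    2^J response-pattern table minus the number of free parameters."""
    return (2 ** n_items - 1) - lca_num_params(n_classes, n_items)


def aic(g2, n_params):
    """AIC as defined in the table note: G² + 2P."""
    return g2 + 2 * n_params


def bic(g2, n_params, n_obs):
    """BIC as defined in the table note: G² + log(N) * P."""
    return g2 + math.log(n_obs) * n_params


# Four yes/no indicators (HPA, MPA, VPA, MSPA) and N = 5,839 as reported.
N_OBS = 5839
p3 = lca_num_params(3, 4)   # 14 parameters for a 3-class model
df3 = lca_df(3, 4)          # 1 degree of freedom remains
df4 = lca_df(4, 4)          # negative, so no G² test is possible

example_g2 = 10.0           # made-up G² value, for illustration only
example_aic = aic(example_g2, p3)
example_bic = bic(example_g2, p3, N_OBS)
```

Under this assumed parameterization, four binary indicators leave 15 degrees of freedom in total. A 3-class model uses 14 of them, while a 4-class model would need 19, which is consistent with the note that tests of fit are unavailable for negative df. Because log(5,839) is well above 2, BIC penalizes each added parameter more heavily than AIC does in this sample.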
Table 5 displays distributions of latent PA class by demographic categories. More participants were categorized in class III than in the other two classes (p<.001). More males were categorized in class II and class III, whereas more females were categorized in class I (p<.001). More younger participants were categorized in class II, as compared to their counterparts (p<.001). More white participants were categorized in class III, as compared to their counterparts (p<.001). Finally, more participants in the higher income groups were categorized in both class II and class III, as compared to their counterparts. Tables 6 and 7 display results of the combined LCA and mortality analyses.
Note: Class I consisted of those not likely to report any forms of PA. Class II consisted of those more likely to report all four forms of PA. Class III consisted of those more likely to report HPA and MPA only. The p value is for the Rao-Scott chi-square statistic.
Note: HR is hazard ratio. CI is confidence interval. Classes I, II, and III are as defined in the preceding note. The p value is for the t-statistic testing the HR. The Adjusted I model is adjusted for age and sex. The Adjusted II model is fully adjusted for age, sex, race/ethnicity, and income.

A total of 54,477 person-years of follow-up was observed, with 864 deaths. Table 6 displays the distribution of latent PA by mortality status. Mortality rates were lowest for class II (4.4%; 95% CI: 2.9-6.0) and class III (7.6%; 95% CI: 6.4-8.6). Table 7 displays hazards associated with latent PA. In the unadjusted model, adults in class III (hazard ratio (HR)=0.36, 95% CI: 0.31, 0.43) and class II (HR=0.21, 95% CI: 0.14, 0.32) were at lower risk of all-cause mortality as compared to their class I counterparts.
The age-sex adjusted model remained significant, with adults in class III (HR=0.41, 95% CI: 0.35, 0.49) and class II (HR=0.39, 95% CI: 0.27, 0.57) at lower risk of all-cause mortality as compared to their class I counterparts. Finally, the fully adjusted model remained significant, with adults in class III (HR=0.43, 95% CI: 0.35, 0.53) and class II (HR=0.41, 95% CI: 0.29, 0.59) at lower risk of all-cause mortality as compared to their class I counterparts.
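For readers less familiar with hazard ratios, the arithmetic behind these figures can be sketched as follows. This is a minimal illustration that uses only numbers reported above. The helper names are hypothetical, and the crude rate pools all classes without survey weights, so it is not a substitute for the weighted estimates in the tables.

```python
def crude_rate_per_1000(deaths, person_years):
    """Unweighted deaths per 1,000 person-years of follow-up."""
    return 1000.0 * deaths / person_years


def percent_lower_hazard(hr):
    """Percent reduction in hazard relative to the reference group."""
    return 100.0 * (1.0 - hr)


def ci_excludes_one(lower, upper):
    """True when a hazard-ratio CI lies entirely below or above 1,
    i.e. the HR differs from 1 at the corresponding alpha level."""
    return upper < 1.0 or lower > 1.0


# Overall crude rate from the reported follow-up totals.
overall_rate = crude_rate_per_1000(864, 54477)  # about 15.9 per 1,000

# Fully adjusted hazard ratios versus class I, as reported above.
fully_adjusted = {
    "class II": (0.41, 0.29, 0.59),
    "class III": (0.43, 0.35, 0.53),
}
summary = {
    name: (round(percent_lower_hazard(hr)), ci_excludes_one(lo, hi))
    for name, (hr, lo, hi) in fully_adjusted.items()
}
```

Read this way, the fully adjusted hazard of death was roughly 59% lower in class II and 57% lower in class III than in class I, and both intervals exclude 1.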

Discussion
The first purpose of this study was to find a best-fitting LCA model using four observed PA variables from a large national health survey. Results from LCA determined that a 3-class latent model fit the data best. The first group (class I) was made up of respondents not likely to endorse any of the four PA variables (HPA, VPA, MPA, and MSPA). Thus, this group of individuals would be considered largely inactive. The second group (class II) was made up of respondents more likely to endorse all four PA variables. Thus, this group would be considered highly active and possibly even structured exercisers. Finally, the third group (class III) was made up of respondents more likely to endorse only HPA and MPA. This group would be considered moderately active and possibly even lifestyle or leisure participants of PA. The weighted prevalence of these classes at baseline is consistent with known distributions of physical inactivity and of adults meeting PA guidelines [31,32].
The second purpose of this study was to use the newly constructed latent PA classes to predict all-cause mortality in U.S. adults using a representative sample. Results clearly showed a dose-response relationship between latent PA and mortality. Specifically, mortality rates were lowest in class II participants, followed by a significantly higher rate in class III participants and a significantly higher rate still in class I participants. These findings are also consistent with previous findings, where adults participating in moderate-to-vigorous PA have been shown to be at lower risk of mortality as compared to their less active counterparts [33][34][35]. A unique aspect of this study is its use of LCA to develop classes of homogeneous participants that differ from one another in their PA behavior, where other methods have provided less than optimal results. Although using LCA to develop latent PA classes is novel, it is not unheard of in the PA literature. LCA has been successfully used to develop latent groups regarding food and PA proximity [36], PA patterns [37], diet and PA behavior [38], PA, sleep, and sedentary behavior [39], as well as accelerometer-determined latent PA [40].
This study has limitations worth discussing. One limitation is the use of self-reported PA behavior at baseline, as opposed to a more objective method (e.g., accelerometers). This limitation may introduce a certain amount of error in classifying participants in terms of their endorsement of each of the four indicator variables. Although this fact should be considered, it should not be viewed as being as serious as it would be had this study used self-reported items to measure the duration and intensity of PA.
As a reminder, this study used self-reported variables that were only concerned with whether a participant engaged in a certain "type" of activity (i.e., HPA, VPA, MPA, and MSPA). Therefore, PA misclassification in this study may have been less severe than in other studies that aimed to measure PA more precisely. Another limitation is the use of baseline PA as an indirect predictor in a prospective study. That is, this study had no means of assessing changes in PA across the observational period. The same is true of all covariates used in model adjustments. Therefore, it is possible that some participants changed their behavior and/or their demographic status over the course of the study period. Thus, the findings of this study should be viewed with caution before considering their implications.

Conclusion
Results from this study indicate that three latent classes of PA behavior exist among U.S. adults. Furthermore, latent classes of PA strongly predict all-cause mortality in U.S. adults. Health promotion specialists should consider latent PA classes as a basis for marketing physical activity interventions aimed at increasing longevity.