Data for the present study were drawn from the population-based Northern Finland Birth Cohort 1966 study (NFBC1966). NFBC1966 is a life-course study involving participants whose dates of birth were expected to be in 1966 in Finland’s two northernmost provinces, Oulu and Lapland (n=12,058, 96.3% of all live births in the study area). The present cross-sectional study included NFBC1966 cohort members who participated in the latest follow-up at age 46 and agreed to wear accelerometers for device-based physical activity measurements (21). A total of 10,321 NFBC1966 cohort members (85.6% of all cohort members) were alive in Finland in 2012 and were invited to the follow-up, of whom 5,621 (46.6% of all cohort members and 54.4% of those who were invited) participated and wore accelerometers (Fig. 1). With respect to the measurement tools and techniques, the collected data fall into four categories: self-reported measures, clinical measures, objective built and natural environmental measures, and objective physical activity measures.
Questionnaires and measurements
Questionnaires
A postal questionnaire was sent to all living cohort members with known addresses. The questionnaire included items on social background, frequency and type of habitual exercises, physical and psychological health and well-being, and work–life and socioeconomic situation. In addition, a separate questionnaire, the Quality of Life Questionnaire (15D©), was used to rate health-related quality of life (22). A further separate survey addressed opinions and experiences and covered questions from the Temperament and Character Inventory (TCI) questionnaire (23). The temperament and personality trait scores were then computed from the responses to the TCI items. More details on the self-reported measures can be found elsewhere (24).
Clinical examination and measurement of physical activity
Participants were also invited to attend a clinical examination. The clinical examinations included measurement of anthropometry, body composition, and cardiorespiratory fitness. Participants’ height, weight, blood pressure and waist-hip ratio were measured and BMI (body mass index) calculated. Participants’ body composition was measured with bio-impedance measurement (InBody720, InBody, Seoul, Korea). A static back muscle strength test (Biering-Sorensen trunk extension test) was performed to evaluate physical performance. A submaximal four-minute single-step test during which heart rate was continuously monitored was performed to assess cardiorespiratory fitness. Further details on the clinical examination protocol and measures are presented elsewhere (25,26).
Objective measurement of physical activity was initiated during the clinical examination using a wrist-worn accelerometer (Polar Active, Polar Electro Oy, Kempele, Finland). Participants were instructed to wear the monitor on the wrist of their non-dominant hand 24 hours a day for 14 days. Polar Active has a uniaxial accelerometer that outputs estimated energy expenditure in metabolic equivalent (MET) values every 30 seconds. The validity of Polar Active under free-living conditions against the doubly labeled water technique has been shown elsewhere (27).
Environmental measures
We obtained the residential coordinates of all participants whose residences were available at the time of the 46-year follow-up data collection (2012–2014) from the Finnish Population Register Centre. We used a geographic information system (ArcGIS 10.3) to calculate built, natural, and socioeconomic environment variables (Supplementary File 1, Table S1) that might describe the conduciveness of participants’ residential environment to PA. We calculated all variables in the year the participant attended the 46-year data collection. We also determined quantitative environmental features using a one-kilometer-radius circular buffer around the residential locations, and the distances (as the crow flies) to amenities were measured using road network data.
Data related to community structure; land use; amenities such as retail, recreation, office, and community institutions; and socioeconomic factors were derived from the Finnish community structure database (28). Street network data, including the number of bus stops, intersection density, and length of cycle paths, were based on the Finnish national road and street database (Digiroad) (29). Data on indoor and outdoor sport facilities were obtained from the Finnish database of sport facilities (30). Natural environment features such as distances to the closest forests and parks and residential area greenness were assessed with the land cover data from the Finnish Environment Institute (31).
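The buffer-based counting described above can be illustrated with a short Python sketch. It is not the ArcGIS workflow used in the study; it only shows, under simplifying assumptions, how features falling inside a one-kilometer circular buffer around a residence could be counted. The function names, the haversine approximation, and the example coordinates are illustrative assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_within_buffer(home, points, radius_m=1000.0):
    """Count points lying inside a circular buffer centred on `home`.

    `home` and each element of `points` are (latitude, longitude) tuples.
    """
    return sum(
        1 for p in points
        if haversine_m(home[0], home[1], p[0], p[1]) <= radius_m
    )


# Hypothetical residence and three hypothetical bus stops.
home = (65.0121, 25.4651)
bus_stops = [
    (65.0121, 25.4651),  # at the residence
    (65.0171, 25.4651),  # roughly 0.56 km north
    (65.0321, 25.4651),  # roughly 2.2 km north
]
n_stops = count_within_buffer(home, bus_stops)
```

A dedicated GIS would normally work in a projected coordinate system and handle line and polygon features (e.g., cycle path length or land cover), which this sketch does not attempt.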
Data mining using a decision tree
We selected a decision tree technique to establish a data-driven model for classifying PA behavior. A decision tree model is created by partitioning the data on the basis of several independent input variables (or predictors) to form homogeneous subgroups with respect to the outcome variable. The hierarchy produced by a decision tree has a flow chart-like structure that makes it possible to identify the relative importance of input variables in predicting the outcomes; predictors in the higher layers of the hierarchy are the more important ones (32). In clinical applications and several other areas in which interpreting the results is of vital importance, decision trees are one of the most widely used classification methods (12,14,32,33).
We used the Chi-squared Automatic Interaction Detection (CHAID) decision tree algorithm to create the model (34). CHAID has been repeatedly used in studies with clinical applications whose main purpose was to identify key factors related to the outcomes of interest (35,36). In this algorithm, homogeneous groups may be formed by any possible combination of the known values of a categorical predictor, or by setting cut-off points at any values of a continuous predictor. The number of independent predictors selected for the model, together with the number of categories (for categorical and ordinal predictors) and intervals (for continuous predictors) formed for them, depends on the results of the Chi-square analyses and on whether the differences are significant. Since the correlates of PA behavior could be of mixed data types, CHAID is an appropriate candidate because it uses a nonparametric procedure with no assumptions about the underlying data and is designed to include continuous, ordinal, and categorical predictors (33).
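To make the role of the Chi-square analysis concrete, the following Python sketch computes the Pearson chi-square statistic for one candidate predictor cross-tabulated against the outcome. It is a simplified illustration only: the full CHAID procedure also merges categories that do not differ significantly and compares multiplicity-adjusted p-values across predictors, none of which is implemented here. The function name and toy data are assumptions.

```python
from collections import Counter


def chi_square(predictor, outcome):
    """Pearson chi-square statistic and degrees of freedom.

    `predictor` and `outcome` are equal-length lists of category labels.
    """
    n = len(predictor)
    row_totals = Counter(predictor)
    col_totals = Counter(outcome)
    observed = Counter(zip(predictor, outcome))

    stat = 0.0
    for r, r_n in row_totals.items():
        for c, c_n in col_totals.items():
            expected = r_n * c_n / n  # expected count under independence
            stat += (observed[(r, c)] - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Toy data: a hypothetical two-category predictor and the binary outcome.
predictor = ["a"] * 10 + ["b"] * 10
outcome = (["inactive"] * 8 + ["active"] * 2
           + ["inactive"] * 2 + ["active"] * 8)
stat, dof = chi_square(predictor, outcome)
```

A larger statistic for the same degrees of freedom corresponds to a smaller p-value, that is, to a predictor whose categories separate active and inactive participants more clearly.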
Decision tree model construction and validation
Input variables (predictors) and physical activity behavior (outcome variable)
The questionnaire, clinical, and environmental measures, except those with more than ~10% missing values, were used as input variables. Recent evidence suggests that any single unidimensional metric (including the most commonly used criterion, which defines physical inactivity as an activity level insufficient to meet present recommendations (1)) might not be enough to define individuals’ PA behavior (10,37–39). To define the PA behaviors for the present study, we therefore used participants’ activity profiles, which we built in a previous study using a multidimensional approach and continuous accelerometer data (20). A distinct aspect of this approach is that continuous accelerometer-measured activity intensities over one full week across the whole intensity continuum, including sedentary (SED), light PA (LPA), and MVPA, were incorporated into a machine learning approach to create the activity profiles.
The details of how the activity profiles were established have been presented elsewhere (20). Briefly, the X-means clustering algorithm was applied to accelerometer-based MET-level data of participants who had seven consecutive valid measurement days (N = 4,582), and four distinct activity profiles (clusters) were derived. A total of 1008 features/variables (10-minute averages of the original 30-sec MET data resulting in 144 MET values for each of the 7 valid measurement days) for each participant were fed into the clustering algorithm for creating the profiles (20). A valid measurement day was defined as at least 600 minutes of activity monitor wearing time per day during waking hours. Seven consecutive valid measurement days were used as a criterion to enable analyzing one full week including both weekdays and weekends. The activity profiles were named with respect to the temporal and intensity patterns of participants’ daily activities in each cluster: Inactive (N = 1,881), Moderately active (N = 802), Evening active (N = 1,297), and Very active (N = 602). Our initial experiments revealed that the decision trees induced for classifying the four activity clusters had poor performance and generalizability, primarily because the outcome variable had both class imbalance (i.e., 41% Inactive, 18% Moderately active, 28% Evening active, and 13% Very active) and class overlap (i.e., those in the Moderately active, Evening active, and Very active clusters had comparable activity profiles with different temporal patterns) problems (40). Previous research has shown that the effects of these two interrelated problems in limiting the performance and generalizability of classification trees are best minimized with a near-balanced class distribution in the outcome variable (41).
We therefore defined those in the Moderately active, Evening active, or Very active clusters as active (N = 2,701), and the remaining ones who were in the Inactive cluster as inactive (N = 1,881). We used the input variables in their original form to classify the two PA behavior categories: active and inactive.
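The feature construction behind the activity profiles (10-minute averages of 30-second MET values, 144 per day and 1008 per week) and the collapse of the four clusters into two categories can be sketched as follows. This is a minimal illustration under the assumption that each day is a complete list of 2,880 epochs; it does not reproduce the wear-time checks or the X-means clustering of the earlier study, and all names are hypothetical.

```python
EPOCH_SECONDS = 30
EPOCHS_PER_DAY = 24 * 60 * 60 // EPOCH_SECONDS   # 2880 epochs per day
EPOCHS_PER_WINDOW = 10 * 60 // EPOCH_SECONDS     # 20 epochs per 10 minutes


def daily_features(met_day):
    """Average one day of 30-second MET values into 144 ten-minute means."""
    if len(met_day) != EPOCHS_PER_DAY:
        raise ValueError("expected one complete day of 30-second epochs")
    return [
        sum(met_day[i:i + EPOCHS_PER_WINDOW]) / EPOCHS_PER_WINDOW
        for i in range(0, EPOCHS_PER_DAY, EPOCHS_PER_WINDOW)
    ]


def weekly_features(met_week):
    """Concatenate seven days of daily features into one 1008-value vector."""
    if len(met_week) != 7:
        raise ValueError("expected seven consecutive days")
    features = []
    for day in met_week:
        features.extend(daily_features(day))
    return features


def binary_behavior(cluster_name):
    """Collapse the four activity profiles into active versus inactive."""
    return "inactive" if cluster_name == "Inactive" else "active"


# Counts per profile as reported in the text.
cluster_sizes = {
    "Inactive": 1881,
    "Moderately active": 802,
    "Evening active": 1297,
    "Very active": 602,
}
n_active = sum(n for name, n in cluster_sizes.items()
               if binary_behavior(name) == "active")
```

With the reported cluster sizes, the collapsed outcome has 2,701 active and 1,881 inactive participants, which is considerably closer to a balanced distribution than the original four classes.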
Missing values and algorithm parameters
Missing values were included in the analysis as a separate category that was allowed to merge with other categories in the decision tree. Imputation of missing values of input variables was therefore unnecessary (35). A previous study has shown that a decision tree developed in the presence of missing values in its input variables has reasonable misclassification rates, especially when the proportion of missing values is not very high (e.g., 20%) (42).
Several parameters must be set prior to constructing a decision tree model. Of these parameters, the pruning criteria are the most important for limiting the size of the tree and preventing overfitting (14). The pruning criteria were set such that groups smaller than 80 were not split any further (minimum number of participants in a parent node), and no group smaller than 40 was formed (minimum number of participants in a child node). The tree growth was limited to 10 layers, meaning that a maximum of 10 factors could be selected to form a group.
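The size limits above amount to a simple stopping rule that is checked before any split is accepted. The sketch below expresses that rule in Python; it is an illustration of the stated thresholds, not the SPSS implementation, and the function name and the convention that the root node has depth 0 are assumptions.

```python
MIN_PARENT_SIZE = 80   # groups smaller than this are not split further
MIN_CHILD_SIZE = 40    # no resulting group may be smaller than this
MAX_DEPTH = 10         # tree growth limited to 10 layers below the root


def may_split(node_size, child_sizes, depth):
    """Return True if a proposed split respects all three size limits.

    `depth` is the depth of the node being split (assumed 0 for the root).
    """
    if depth >= MAX_DEPTH:
        return False
    if node_size < MIN_PARENT_SIZE:
        return False
    return all(size >= MIN_CHILD_SIZE for size in child_sizes)
```

For example, a node of 100 participants may be split into groups of 60 and 40, but not into groups of 61 and 39, and a node of 79 participants is never split.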
Model validation and visualization
We created and validated the model using 10-fold cross-validation. To evaluate the accuracy of the final decision tree model, we used the confusion matrix, which shows the proportion of participants in each outcome category who were correctly and incorrectly classified. In the visualization of the final tree, the percentage of active and inactive participants in each subgroup was presented along with the response index (RI). The RI is the percentage of inactive participants in each subgroup relative to that of inactive participants in the total sample (i.e., 41.1%). Similar to an odds ratio, the RI is an indicator of the direction and strength of the association (16).
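The two summary quantities described above can be written out in a few lines of Python. The sketch assumes that the RI is reported as a percentage (so that 100 means the subgroup has the same share of inactive participants as the total sample); the scaling, the function names, and the toy labels are assumptions made for illustration.

```python
def response_index(n_inactive_subgroup, n_subgroup,
                   n_inactive_total, n_total):
    """Share of inactive participants in a subgroup relative to the total
    sample, expressed as a percentage (assumed scaling)."""
    subgroup_share = n_inactive_subgroup / n_subgroup
    total_share = n_inactive_total / n_total
    return 100.0 * subgroup_share / total_share


def confusion_matrix(actual, predicted, labels=("active", "inactive")):
    """Nested dict of counts: matrix[actual_label][predicted_label]."""
    matrix = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


def accuracy(matrix):
    """Proportion of participants on the diagonal of the confusion matrix."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


# Toy example with four hypothetical participants.
actual = ["active", "active", "inactive", "inactive"]
predicted = ["active", "inactive", "inactive", "inactive"]
cm = confusion_matrix(actual, predicted)
```

Under this scaling, an RI above 100 marks a subgroup in which inactive participants are over-represented, and an RI below 100 marks one in which they are under-represented.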
Activity patterns in decision tree-formed subgroups of participants
Given that the outcome variable was formed with a multidimensional approach, we also calculated Z-scores of three PA metrics, namely average daily time (minutes per day [min/day]) spent in SED, LPA, and MVPA, in each decision tree-formed subgroup of participants. A Z-score indicates how many standard deviations the mean of a measure in a subgroup is away from the corresponding mean in the whole study population. As such, we could compare the variation of the three activity intensities across different subgroups with respect to the study population means. We calculated these three PA metrics from the same seven consecutive valid measurement days used to establish the activity profiles (20), using cut-points previously validated by the accelerometer manufacturer (SED, 1–1.99 MET; LPA, 2–3.49 MET; and MVPA, ≥ 3.5 MET) (43).
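As a worked illustration of the two steps above, the following Python sketch classifies 30-second MET values with the quoted cut-points and computes a subgroup Z-score. Several details are assumptions not stated in the text: the intensity bands are treated as half-open intervals, values below 1 MET are left unclassified, and the population (rather than sample) standard deviation is used.

```python
import statistics


def intensity(met):
    """Label one MET value using the cut-points quoted in the text."""
    if met >= 3.5:
        return "MVPA"
    if met >= 2.0:
        return "LPA"
    if met >= 1.0:
        return "SED"
    return None  # below 1 MET: left unclassified in this sketch


def minutes_per_category(met_epochs, epoch_seconds=30):
    """Total minutes spent in SED, LPA, and MVPA for a list of epochs."""
    minutes = {"SED": 0.0, "LPA": 0.0, "MVPA": 0.0}
    for met in met_epochs:
        label = intensity(met)
        if label is not None:
            minutes[label] += epoch_seconds / 60
    return minutes


def subgroup_z(subgroup_values, population_values):
    """Distance of the subgroup mean from the population mean, in
    population standard deviations."""
    mu = statistics.fmean(population_values)
    sd = statistics.pstdev(population_values)
    return (statistics.fmean(subgroup_values) - mu) / sd
```

A positive Z-score for MVPA, for instance, would indicate a subgroup whose members accumulate more MVPA on average than the study population as a whole.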
Association analysis
The PA metrics described above (SED, LPA, and MVPA) were also used for association analyses. We examined the association between factors emerging from the model and these PA metrics to determine the significance and relative importance of the methodologically identified factors. We used adjusted generalized linear mixed models, including urban–rural area as a random effect, to examine the association of each independent variable (factor emerging in the decision tree) separately with min/day in SED, LPA, and MVPA. Age and gender were used as covariates in all models. We standardized the continuous independent variables to a mean of zero and a standard deviation (SD) of 1 before including them in regression analyses. As such, we could interpret coefficients (B) from the models encompassing a continuous independent variable as the change in the outcome (e.g., min/day of LPA) for every 1 SD change in the independent variable and therefore compare their magnitudes for the same outcome regardless of the original unit. We included the categorical and ordinal independent variables in the regression analyses in the form of dummy variables and set the response category at the lowest end as the reference category. A p-value below 0.05 was considered statistically significant. All analyses (including data mining) were performed with IBM SPSS Statistics for Windows, version 25.0 (IBM Corporation, Armonk, USA).
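The two preprocessing steps for the independent variables can be sketched in Python as follows. This is an illustration rather than the SPSS procedure used in the study: the use of the sample standard deviation, the function names, and the dummy-variable naming scheme are assumptions, and the mixed models themselves are not reproduced.

```python
import statistics


def standardize(values):
    """Rescale a continuous variable to mean 0 and SD 1.

    Uses the sample standard deviation (an assumption).
    """
    mu = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mu) / sd for v in values]


def dummy_code(values, ordered_levels):
    """Dummy-code a categorical or ordinal variable.

    The first (lowest) level in `ordered_levels` is the reference
    category and therefore receives no indicator column.
    """
    _reference, *other_levels = ordered_levels
    return [
        {f"is_{level}": int(v == level) for level in other_levels}
        for v in values
    ]


# Hypothetical continuous and ordinal variables.
z_values = standardize([1.0, 2.0, 3.0])
dummies = dummy_code(["low", "high"], ["low", "mid", "high"])
```

After standardization, a coefficient B for a continuous variable reads as the change in min/day of the outcome per 1 SD change in that variable, which is what allows magnitudes to be compared across predictors measured in different units.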