Study Design and Participants
This was a cross-sectional analysis of baseline data from a multisite clinical trial with four sites: University of Delaware, University of Pennsylvania, Christiana Care Health System, and Indiana University (NCT02835313) (39). Participants were eligible for this analysis if they met the following criteria. Inclusion: (1) age 21–85 years, (2) ≥ 6 months post stroke, (3) able to walk at a self-selected gait speed of ≥ 0.3 m/s without assistance from another person (assistive devices allowed), (4) resting heart rate of 40–100 beats/minute, (5) resting blood pressure between 90/60 and 170/90 mmHg. Exclusion: (1) evidence of cerebellar stroke, (2) other potentially disabling neurologic conditions in addition to stroke, (3) lower limb Botulinum toxin injection < 4 months earlier, (4) current participation in physical therapy, (5) inability to walk outside the home prior to stroke, (6) coronary artery bypass graft, stent placement, or myocardial infarction within the past 3 months, (7) musculoskeletal pain that limits activity, (8) inability to provide informed consent, as indicated by an inability to answer at least one orientation question correctly (item 1b on the NIH Stroke Scale) and an inability to follow at least one two-step command (item 1c on the NIH Stroke Scale). In addition, only participants with complete data for the activity measures, 6MWT, and SBP were included in this analysis. All participants signed informed consent approved by the Human Subjects Review Board at the University of Delaware or their respective institution prior to study participation (protocol number 878153-50).
Theoretical Framework
To determine which measures of real-world activity to include in our statistical analysis, we first reviewed the literature that measured real-world walking activity in people with stroke (2, 7–12, 14, 16–18, 20, 21, 23–25, 27, 28, 30, 32, 35, 40–45) and in other populations (26, 33, 46–56). This literature search identified over 30 different measures of real-world activity. We then systematically eliminated measures that were derivatives of each other (e.g., Peak 1, a measure of real-world activity intensity used in some studies (26, 50), is similar to Peak 30 (20, 50)) and measures that could be problematic in stroke. For example, metabolic equivalents (METs) are commonly used to quantify activity intensity (25, 27, 57, 58); however, there are limitations to using METs in people with stroke (59–61). This process resulted in a smaller subset of measures identified as being potentially relevant for people with stroke and the current work. Once this smaller subset was identified, the measures were grouped under specific domains based on our knowledge of stroke and past literature suggesting that different activity measures assess different constructs (11, 12, 21, 25, 26, 32, 33). Figure 1 provides a visual representation of the end result of this process and shows our theoretical framework for conceptualizing activity behavior. The model shows that activity behavior comprises four domains: activity volume, activity frequency, activity intensity, and sedentary behavior. Each of these domains is intended to reflect an important but unique aspect of a stroke survivor’s overall walking activity behavior and is discussed in greater detail below.
Activity Volume. This domain is intended to capture a person’s overall volume of activity and encompasses measures such as average steps/day (ASPD) (33) and time walking per day (23, 43), which provide a global representation of a person’s overall volume of activity over a particular period of time. As discussed above, past work suggests that measures of activity volume, specifically ASPD, may be insufficient for understanding the relationship between activity and health and that additional or alternative measures are needed (12, 23–25, 30, 33).
Activity Frequency. Previous work suggests that the frequency (i.e., bouts) with which activity is accrued throughout the day differs between people with stroke and healthy controls (7, 23, 44). In addition, longitudinal studies in stroke have demonstrated that increases in activity volume (i.e., ASPD) may be partly explained by increases in the number of walking bouts (16, 62). This suggests that the frequency with which activity is accrued may provide unique and important information beyond measures of activity volume (i.e., ASPD). The number of long (23, 44, 52) and short bouts (23, 44, 52) of walking activity as well as the overall number of walking bouts per day (23, 43, 44) were considered measures of activity frequency.
Activity Intensity. Stroke prevention guidelines suggest that individuals with stroke should engage in moderate-to-vigorous intensity aerobic physical activity to lower their risk of recurrent stroke and cardiovascular events (63). This suggests that the intensity of walking activity may also be important when monitoring real-world walking behavior in people with stroke. In support of this point, Fini and colleagues found that time spent in moderate-to-vigorous physical activity was associated with important cardiovascular risk factors in people with stroke over a two-year monitoring period (25). This study, among others (21, 24, 32, 41), provides support that the intensity of real-world activity may be important in addition to the overall volume of activity. Peak 30 and average bout cadence were considered measures of activity intensity (20, 33, 45, 50).
Sedentary Behavior. There is growing consensus that sedentary time is a construct independent of active time (7, 8, 10, 24, 27, 28, 48, 64). Previous studies have shown that time spent in sedentary behaviors is associated with negative health outcomes, independent of active time (48, 65, 66). Other studies have shown that breaking up the amount of time spent in sedentary behaviors has positive effects on cardiometabolic markers, such as blood glucose, systolic blood pressure, and body mass index (30, 55, 56). Taken together, these findings suggest that in addition to measuring time spent in active behaviors, time spent in sedentary behaviors should also be measured when attempting to understand the relationship between activity and cardiovascular risk. The percentage of time spent in sedentary behaviors (25, 28, 53, 54), the number of long sedentary bouts (11, 25, 46), and the fragmentation index (24, 30, 55, 56) were considered measures of the sedentary domain.
Measures
During the baseline visit of the clinical trial, demographic information (i.e., age, gender, race) and stroke information (i.e., time since initial stroke) were collected. Participants’ resting blood pressure was collected in accordance with the American College of Sports Medicine (ACSM) guidelines (67). Specifically, blood pressure readings were obtained with the participant seated in a chair with back support for at least 5 minutes, their legs uncrossed, and the arm supported at the level of the heart. A minimum of two readings were obtained with at least 1 minute between readings. The two readings were averaged to represent the participant’s resting blood pressure (67). However, if a difference of > 5 mmHg was observed between the first and second readings, an additional reading was obtained, and the average of these multiple readings was used.
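The averaging rule for resting blood pressure can be sketched in a few lines of Python. This is an illustrative sketch only, not the trial's data-entry procedure: the function name `resting_bp` is hypothetical, the rule is assumed to be applied to one component (e.g., systolic) at a time, and "these multiple readings" is assumed to mean all readings obtained.

```python
def resting_bp(readings, max_diff=5):
    """Average resting blood pressure readings (mmHg) for one component.

    Assumptions (not stated in the text): `readings` holds the values for a
    single component, in the order taken, and when a third reading is
    required the average is taken over all readings obtained.
    """
    if len(readings) < 2:
        raise ValueError("at least two readings are required")
    first, second = readings[0], readings[1]
    if abs(first - second) > max_diff:
        # The first two readings disagree by more than 5 mmHg,
        # so an additional reading must have been obtained.
        if len(readings) < 3:
            raise ValueError("additional reading required")
        return sum(readings) / len(readings)
    return (first + second) / 2


# Invented values for illustration only.
close_pair = resting_bp([120, 124])            # readings differ by 4 mmHg
with_third = resting_bp([120, 130, 125])       # readings differ by 10 mmHg
```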
To measure walking capacity, participants completed the 6-Minute Walk Test (6MWT). Participants were instructed to walk continuously as fast as possible for 6 minutes around a 42-meter rectangular track (68). The 6MWT is a valid and reliable test of walking endurance in people with stroke (69, 70).
To measure real-world walking activity, participants were provided with a Fitbit One or Fitbit Zip to wear on their non-paretic ankle. The Fitbit has demonstrated acceptable accuracy in detecting stepping activity in people with stroke (71–74). Participants were instructed to wear the device for 7 days; however, a minimum of 3 days of activity was required (19). Participants were instructed to go about their usual activity while wearing the device and to remove it for water-based activities and sleep. Upon return of the device, a trained physical therapist inspected the data to ensure the minimum wear criteria were met. To determine valid recording days, the participant was queried about any inconsistencies or irregularities in the data. The days on which participants were issued and returned the device were not counted towards the 3-day minimum, nor were any days on which the participant did not wear the device during waking hours.
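As a minimal sketch of the day-counting rule, the Python below lists candidate recording days and checks the 3-day minimum. The function names and the `not_worn` argument are hypothetical; in the study, days without wear were identified by a physical therapist together with the participant, which this sketch simply takes as an input.

```python
from datetime import date, timedelta


def valid_days(issued, returned, not_worn=()):
    """Days strictly between issue and return, minus days flagged as not worn."""
    days = []
    day = issued + timedelta(days=1)      # the issue day itself is not counted
    while day < returned:                 # the return day itself is not counted
        if day not in not_worn:
            days.append(day)
        day += timedelta(days=1)
    return days


def meets_minimum(days, minimum=3):
    """True if the number of valid recording days reaches the minimum."""
    return len(days) >= minimum


# Invented dates for illustration only.
days = valid_days(date(2020, 1, 1), date(2020, 1, 9), not_worn={date(2020, 1, 4)})
```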
Data Processing
Figure 2 displays the data pipeline used to process and analyze the data. Participants’ step data were exported in 60-second epochs to calculate the activity measures of interest (Fig. 2: “Raw Data (60-sec epoch)”). The first stage of data processing involved determining “wear” and “non-wear” time using the R package “accelerometry” (75). We employed a two-step process to determine an appropriate “non-wear” window and increase our confidence in this decision (Fig. 2: “Testing non-wear intervals”). First, non-wear windows of 3 hours through 6 hours were tested, and the numbers of sedentary and non-wear minutes were compared using a within-subjects analysis of variance (ANOVA) in which the non-wear window was the within-subjects variable. Post-hoc testing was conducted if the model was statistically significant. Second, a clinician with expertise in stroke rehabilitation independently coded whether each minute was “non-wear”, “sedentary” or “active” time for a random subset of 10 participants, and these results were compared to those of the different non-wear windows. These steps revealed significant differences (p < 0.05) in the number of sedentary and non-wear minutes for the 3-hour non-wear window compared to all other non-wear windows. Comparing these results to the clinician responses revealed the highest agreement with the 4-hour non-wear window (> 85% agreement for all 10 participants, mean agreement of 94.67%). We therefore determined the 4-hour non-wear window was most appropriate. Under this definition, “non-wear” time was defined as any interval of at least 240 consecutive minutes (4 hours) with 0 steps, allowing for 2 spurious minutes of activity of up to 2 steps each minute. Any minutes that did not meet this criterion were defined as “wear” time. Non-wear minutes were then removed from further analysis (Fig. 2: “Remove Non-Wear Time”). “Wear” time was further categorized as “active” or “sedentary” (Fig. 2: “Distinguish Wear Time as Active or Sedentary”). “Active” minutes were any minutes with at least 1 step, except that a minute with only 1 step was not counted as active if it was both preceded and followed by a minute with 0 steps. All other “wear” minutes that did not meet this criterion were considered “sedentary” minutes. For example, a series of minutes with 0 steps, 1 step, 0 steps would be labeled: sedentary, sedentary, sedentary. A series of minutes with 10 steps, 12 steps, 0 steps, 20 steps would be labeled: active, active, sedentary, active. The activity measures were calculated from the “active” and “sedentary” time (Fig. 2: “Processed Activity Measures”). For example, the average time walking/day was calculated from the “active” minutes, and the percent sedentary time was calculated using “sedentary” minutes. Table 1 displays the activity measures of interest, the domain of measurement, and how each measure was calculated.
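The minute-level rules above can be illustrated with a short Python sketch. It is a simplified re-implementation for exposition, not the “accelerometry” package's algorithm and not the study code: the function names are hypothetical, a tolerated spurious minute is allowed to sit at the edge of a non-wear interval, and minutes at the start or end of a series are treated as if their missing neighbor had 0 steps.

```python
from collections import deque


def flag_non_wear(steps, window=240, max_spurious=2, max_spurious_steps=2):
    """Return one boolean per minute: True if it lies in a non-wear interval."""
    non_wear = [False] * len(steps)
    start = 0            # first minute of the current candidate interval
    spurious = deque()   # indices of tolerated low-step minutes in the interval
    for end, count in enumerate(steps):
        if count > max_spurious_steps:
            # Too many steps to be spurious: restart after this minute.
            start = end + 1
            spurious.clear()
            continue
        if count > 0:
            spurious.append(end)
            if len(spurious) > max_spurious:
                # Keep at most `max_spurious` tolerated minutes in the interval.
                start = spurious.popleft() + 1
        if end - start + 1 >= window:
            for i in range(start, end + 1):
                non_wear[i] = True
    return non_wear


def label_wear_minutes(steps):
    """Label each wear minute as 'active' or 'sedentary'."""
    labels = []
    for i, count in enumerate(steps):
        before = steps[i - 1] if i > 0 else 0               # assumption at edges
        after = steps[i + 1] if i + 1 < len(steps) else 0   # assumption at edges
        isolated_single_step = count == 1 and before == 0 and after == 0
        is_active = count >= 1 and not isolated_single_step
        labels.append("active" if is_active else "sedentary")
    return labels


# The two worked examples from the text.
example_a = label_wear_minutes([0, 1, 0])
example_b = label_wear_minutes([10, 12, 0, 20])
```

In this sketch, `flag_non_wear` would be applied to the full minute-by-minute series first, and `label_wear_minutes` to the minutes that remain after non-wear minutes are dropped.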
Table 1. Activity Measure Calculations
Statistical Analysis
To address our first objective of identifying a subset of activity measures most strongly related to SBP, two variable selection techniques were utilized. For the primary analysis, lasso regression was employed. Lasso regression applies a penalty, controlled by the parameter λ, that shrinks the regression coefficients closer towards zero such that some of the variables (i.e., activity measures) are dropped from the model (76, 77). The result is a simpler model containing a subset of variables whose coefficients were not zero. Those “surviving” variables are therefore interpreted as most strongly related to the outcome. For this work, the optimal value of λ was chosen using 10-fold cross-validation which was replicated 100 times to achieve a stable solution (Fig. 2: “10-fold cross-validation x 100”) (76). The optimal value of λ was considered the value associated with the smallest mean squared error on the test data (76, 77). Once this optimal value of λ was identified, the model was then re-fit using all of the data and the optimal value of λ (Fig. 2: “Refit model using all data and optimal λ”). This process resulted in a subset of walking activity measures most strongly related to SBP. The lasso regression was performed using R Statistical Software (v3.6.1) (78) and the “glmnet” package (79).
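For reference, the lasso estimate and the cross-validated choice of λ can be written as follows. The notation is ours and is introduced only for illustration: y_i is SBP for participant i, x_i is the vector of activity measures, F_k is the k-th of K = 10 folds with n_k participants, and the superscript (−k) marks coefficients estimated without fold k. The 1/(2n) scaling follows the form used by “glmnet” for a continuous outcome. How the 100 replications were combined is not specified here, so the expression shows a single replication.

```latex
\hat{\beta}(\lambda) = \arg\min_{\beta_0,\,\beta}\;
  \frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
  + \lambda\sum_{j=1}^{p}\lvert\beta_j\rvert

\mathrm{CV}(\lambda) = \frac{1}{K}\sum_{k=1}^{K}\frac{1}{n_k}
  \sum_{i\in F_k}\bigl(y_i - \hat{\beta}_0^{(-k)}(\lambda)
  - x_i^{\top}\hat{\beta}^{(-k)}(\lambda)\bigr)^2,
\qquad
\hat{\lambda} = \arg\min_{\lambda}\,\mathrm{CV}(\lambda)
```

Larger values of λ set more coefficients exactly to zero, which is why the variables with non-zero coefficients at the selected λ form the retained subset.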
To increase our confidence in the subset of activity measures retained from lasso, we also utilized the best subset method and compared these results to those of lasso. Unlike lasso, which performs variable selection by shrinking coefficients, the best subset method performs variable selection by fitting separate regression models for all possible combinations of predictors to determine which model (i.e., subset of predictors) is “best” (76, 80). For this work, we determined which model was “best” using three criteria: the Akaike information criterion (AIC), adjusted R², and the residual sum of squares (RSS) (Fig. 2) (76). As lower AIC values indicate a better model, we rank-ordered all 1,024 possible models from lowest to highest AIC and selected the model with the lowest AIC value (76). As higher adjusted R² values indicate better fit, we rank-ordered all possible models from highest to lowest adjusted R² and selected the model with the highest adjusted R² value (76). As the RSS always decreases as more variables are added to the model, we utilized the number of variables retained from lasso (p) and identified the p-variable model with the lowest RSS. For example, if lasso retained 2 variables as most strongly related to SBP, we identified the 2-variable model with the lowest RSS. The result of this step was three models (i.e., subsets of predictors): those with the lowest AIC, the highest adjusted R², and the lowest RSS. The best subset models were fit using the regsubsets function within the “leaps” package (81) in R as well as the Regression Best Subsets extension in SPSS Version 28.0 (IBM Corp., Armonk, NY). These results were compared to those of lasso (Fig. 2: “Measures Common among all Models”). Measures that were common among all approaches were fit in a linear regression model (discussed below).
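The exhaustive search can be illustrated with a small, self-contained Python sketch. It is not the “leaps” or SPSS implementation: the function names and the toy data are hypothetical, ordinary least squares is solved with plain normal equations (which assumes no perfectly collinear predictors), and AIC is computed up to an additive constant as n·ln(RSS/n) + 2k, which preserves the ranking of models fit to the same data. With 10 candidate measures, enumerating every subset including the intercept-only model gives 2^10 = 1,024 models, matching the count reported above.

```python
from itertools import combinations
from math import log


def ols_rss(columns, y):
    """Residual sum of squares for y regressed on an intercept plus `columns`."""
    n = len(y)
    design = [[1.0] * n] + [list(map(float, c)) for c in columns]
    k = len(design)
    # Augmented normal equations [X'X | X'y].
    aug = [[sum(a * b for a, b in zip(ci, cj)) for cj in design]
           + [sum(a * b for a, b in zip(ci, y))] for ci in design]
    for i in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(i, k), key=lambda r: abs(aug[r][i]))
        aug[i], aug[pivot] = aug[pivot], aug[i]
        for r in range(k):
            if r != i:
                factor = aug[r][i] / aug[i][i]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[i])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    fitted = [sum(b * col[t] for b, col in zip(beta, design)) for t in range(n)]
    return sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))


def best_subsets(data, y, lasso_size):
    """Score every subset of predictors; `lasso_size` is the size used for RSS."""
    n = len(y)
    mean_y = sum(y) / n
    tss = sum((v - mean_y) ** 2 for v in y)
    models = []
    for size in range(len(data) + 1):
        for subset in combinations(data, size):
            rss = ols_rss([data[name] for name in subset], y)
            aic = n * log(rss / n) + 2 * (size + 1)   # up to an additive constant
            adj_r2 = 1 - (rss / (n - size - 1)) / (tss / (n - 1))
            models.append({"subset": subset, "rss": rss, "aic": aic, "adj_r2": adj_r2})
    same_size = [m for m in models if len(m["subset"]) == lasso_size]
    return {
        "n_models": len(models),
        "lowest_aic": min(models, key=lambda m: m["aic"])["subset"],
        "highest_adj_r2": max(models, key=lambda m: m["adj_r2"])["subset"],
        "lowest_rss": min(same_size, key=lambda m: m["rss"])["subset"],
    }


# Toy data invented for illustration; these are not study data.
toy = {"x1": [1, 2, 3, 4, 5, 6, 7, 8],
       "x2": [2, 1, 4, 3, 6, 5, 8, 7],
       "x3": [1, 0, 0, 1, 1, 0, 1, 0]}
noise = [0.1, -0.2, 0.05, 0.1, -0.1, 0.15, -0.05, -0.05]
outcome = [3 * a + e for a, e in zip(toy["x1"], noise)]
result = best_subsets(toy, outcome, lasso_size=1)
```

Measures that appear in all three selected subsets and in the lasso solution would then be carried forward, as described above.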
Sequential linear regression was used to address our second objective of determining whether the subset of activity measures selected as described above was significantly related to SBP above and beyond walking capacity (Fig. 2: “Final Regression Model with Covariates & 6MWT”). In this approach, predictors are entered in blocks and the change in R² is evaluated after each block entry to determine whether the block is significantly related to SBP after adjusting for the previous blocks (82). The first block of predictors included covariates, specifically age, gender, race, and time since initial stroke. Gender was coded as male (0) or female (1). Race was categorized as white, black, or other; the “other” category consisted of individuals who identified as a race other than black or white (e.g., Asian) or as more than one race. Race was then dummy coded as white (0) compared to black (1) and white (0) compared to other (1). Walking capacity (i.e., 6MWT) was entered into the second block. The third block consisted of the activity measures common to the lasso and best subset models. All regression assumptions were tested and met. The sequential linear regression was conducted in SPSS Version 28.0 (IBM Corp., Armonk, NY).
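The covariate coding and the test of the change in R² can be sketched as follows. This is illustrative Python rather than the SPSS procedure used in the study: the function names are hypothetical, and the sketch assumes the standard F test for the change in R², F = ((R²_current − R²_previous) / q) / ((1 − R²_current) / (n − k − 1)), where q is the number of predictors added in the block and k is the number of predictors in the model after the block is entered.

```python
def code_gender(gender):
    """Male = 0, female = 1, as described in the text."""
    return 1 if gender == "female" else 0


def dummy_code_race(race):
    """White is the reference category; returns (black_vs_white, other_vs_white)."""
    return (1 if race == "black" else 0, 1 if race == "other" else 0)


def f_change(r2_previous, r2_current, n, k_current, k_added):
    """F statistic and degrees of freedom for the R-squared change of one block.

    n          number of participants
    k_current  predictors in the model after the block is entered
    k_added    predictors added by the block
    """
    df_residual = n - k_current - 1
    f_value = ((r2_current - r2_previous) / k_added) / ((1 - r2_current) / df_residual)
    return f_value, k_added, df_residual


# Invented numbers for illustration only; these are not study results.
f_value, df1, df2 = f_change(r2_previous=0.10, r2_current=0.20,
                             n=103, k_current=2, k_added=1)
```

The resulting F value would be compared against an F distribution with (df1, df2) degrees of freedom to judge whether the block adds explanatory value beyond the earlier blocks.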