We performed an EWAS of objectively measured physical activity using methylation data from peripheral blood leukocytes in a cross-ancestry sample of pregnant women. We identified two CpG sites associated with SB and 122 CpG sites associated with MPA after controlling for the number of steps/day in addition to observed confounders, suggesting that these CpG sites are related to SB and MPA, independent of general physical activity. The majority of the CpG sites examined in further analyses (n = 10 for MPA, and n = 2 for SB) were associated with BMI, in the same direction as their association with SB and in the opposite direction to their association with MPA. Both SB related CpG sites were located in VSX1 and were associated with genetic variants in cis. Three of the examined MPA related sites, cg05094046, cg11949866, and cg07919197, were associated with gene expression in CD4+ cells (MESA, n = 1202). Findings were not replicated in adolescents with Actigraph data in an independent cohort (ALSPAC).
Our model adjusting for steps per day was designed to isolate the specific effect of SB and MPA on DNA methylation independently of general physical activity level. The reasoning for this was based on previous studies suggesting that SB is associated with increased risk of mortality and morbidity when adjusting for moderate to vigorous physical activity (36, 37). Given this background, we hypothesized that by adjusting for the number of steps (27), we could estimate the specific effect of SB and MPA on DNA methylation independent of general physical activity level. The associations we observed could be interpreted as potential effects of SB and MPA independent of general activity. However, we acknowledge that these results could also be explained by bias due to inflation, as suggested by the elevated lambda. Furthermore, although the number of steps may not causally alter DNA methylation levels, collider bias cannot be ruled out completely. This is because steps per day may be a mediator rather than a confounder, meaning that unknown confounders between steps per day and DNA methylation could introduce bias (38).
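For readers less familiar with the inflation statistic referred to above, the genomic inflation factor lambda is conventionally computed as the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455). The following is a minimal Python sketch of that standard definition, not the code used in this study; the function name `genomic_inflation` is hypothetical, and it assumes two-sided p-values strictly greater than 0 and at most 1.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def genomic_inflation(p_values):
    """Return lambda = median observed chi-square (1 df) / expected null median.

    Assumes two-sided p-values in (0, 1]; a p-value of exactly 0 would raise
    an error because the normal quantile is undefined there.
    """
    # A 1-df chi-square statistic is the square of a standard normal z-score,
    # so each two-sided p-value maps to chi2 = z**2 with z = Phi^-1(1 - p/2).
    chi2 = [_STD_NORMAL.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    # The median of a 1-df chi-square is Phi^-1(0.75)**2, roughly 0.455.
    expected_median = _STD_NORMAL.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median


# Illustrative input only: evenly spaced p-values mimic a well-calibrated null.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lambda_null = genomic_inflation(null_like)
# Squaring shifts p-values towards zero, mimicking an inflated test statistic.
lambda_inflated = genomic_inflation([p ** 2 for p in null_like])
```

Under this definition, values close to 1 indicate well-calibrated test statistics, whereas values clearly above 1 indicate that p-values are systematically smaller than expected under the null.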
In comparison to previous studies, we explored replication in an independent cohort (ALSPAC). ALSPAC used an earlier DNA methylation chip, which covered fewer CpG sites than the iteration used in our discovery sample. Of the 123 discovered CpG sites, 76 could be explored for replication, and we found no evidence of replication. This might be because some of our discovery results are false positives, because of underlying differences between the two study populations (EPIPREG consisted of adult pregnant women while ALSPAC consisted of adolescents of both sexes), or because we could not adjust for steps per day in ALSPAC. In our attempt to verify previously identified PA related CpG sites from published studies, only cg17385847 (NPM1) from the EWAS of objectively measured PA (11) had p < 0.05 in EPIPREG. The scarcity of associations could be due to false positives or underlying methodological differences across studies, such as questionnaire assessed vs. objectively measured PA, different statistical models, or different categorization of PA. Furthermore, the objectively measured physical activity findings of Fox and collaborators (11) did not reach genome-wide significance, which increases the risk of false positives. Lastly, these studies included both men and women, while our population consisted of pregnant women.
The inclusion of the total daily number of steps as a covariate did not affect the VIF scores, but the calculated blood cell composition introduced multi-collinearity in the model. Although it is recommended to adjust for the major blood cell types in peripheral blood leukocytes, it is known that the Houseman method to calculate white blood cell composition indeed could introduce multi-collinearity (39). However, by excluding CD4T and CD8T, which were highly correlated with Neutrophils in our data, the VIF scores improved, and the coefficients changed marginally. Hence, the inclusion of the six major cell types in the model is unlikely to affect the conclusions. Another important possible bias is that physical activity was recorded after blood sample collection and GDM diagnosis, which could have affected the women's physical activity patterns, i.e. being more active after diagnosis. However, adjustment for GDM in sensitivity analyses suggested that GDM had little impact on the reported associations.
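To illustrate the kind of VIF check described above, the sketch below computes variance inflation factors as the diagonal of the inverse of the covariates' correlation matrix, which is equivalent to 1/(1 - R²) from regressing each covariate on the others. It is a minimal pure-Python illustration under stated assumptions, not the analysis code of this study: the function names and the toy data are hypothetical, and it assumes no covariate is an exact linear combination of the others (otherwise the matrix is singular and the inversion fails).

```python
def _pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def _invert(matrix):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    n = len(matrix)
    # Augment the matrix with the identity; the right half becomes the inverse.
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]  # zero here means the covariates are collinear
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """Map each covariate name to its VIF; columns is a dict name -> values."""
    names = list(columns)
    corr = [[_pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = _invert(corr)
    # The j-th diagonal element of the inverse correlation matrix is VIF_j.
    return {name: inverse[i][i] for i, name in enumerate(names)}


# Toy data, purely illustrative: two covariates with a correlation of 0.8.
toy = {"cov_a": [1, 2, 3, 4, 5], "cov_b": [2, 1, 4, 3, 5]}
toy_vif = variance_inflation_factors(toy)
```

With a check of this kind, removing one of a set of strongly correlated covariates and recomputing the VIFs shows how much of the inflation was attributable to that covariate.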
Among the CpG sites selected for further analyses, the effect sizes for six of the ten MPA related CpG sites were inversely related to SB and BMI. In contrast, the effect directions for the SB related CpG sites and BMI were consistent. These relationships follow a trend similar to that of other studies where MVPA is usually negatively associated with BMI (40–42), while SB is positively associated with BMI (41).
Some of the examined CpG sites were in genes previously related to cardiometabolic health. Both SB related CpG sites (cg26698820 and cg19592637) were located in VSX1, a gene previously identified in a HbA1c GWAS (43). Among the MPA related CpG sites, cg05094046 (DYM) and cg11949866 (CASZ1) lay in genes important for the homeostasis of the cardiovascular system. CASZ1 is an essential gene for cardiac development (44), and a loss-of-function mutation in this gene is associated with hereditary dilated cardiomyopathy (44). Moreover, genetic variants in CASZ1 have been associated with LDL cholesterol (45), total cholesterol (45), ischemic stroke (46), and both systolic and diastolic blood pressure (45). DYM has genetic variants previously associated with body fat percentage (47) and coronary artery disease (48). Methylation of cg05094046 (DYM) and cg11949866 (CASZ1) was associated with gene expression in MESA, implying that the methylation changes induced by MPA have transcriptional implications. However, future studies are needed to replicate these associations and to evaluate whether these potential changes in methylation are long-term.
Among the MPA related CpG sites, cg07919197 was located in a gene previously associated with exercise (TXLNA) (49). TXLNA codes for the cytokine interleukin 14, which plays a role in B-cell proliferation and antibody formation (50). Kisma and collaborators (49) observed that TXLNA expression varies with exercise: it increased immediately after intense exercise, but decreased 15 minutes after recovery. Although the conclusions of that study should be taken with caution as the total sample size was very small (n = 3), methylation at cg07919197 was associated with expression of TXLNA. This latter finding hints that physical activity modifies TXLNA expression through DNA methylation changes.
Major strengths of this study are the well-characterized, population-based cohort, objectively recorded physical activity, and inclusion of two ancestries. Another strength is the availability of genetic data in our sample, which allowed us to perform mQTL analysis. Important limitations include the limited sample size. The DNA methylation quantification in EPIPREG was done in peripheral blood leukocytes, while the expression analysis in MESA was done only in isolated CD4+ cells; hence there could be differences in gene expression across studies due to differences in white blood cell composition. The inflation observed in model 2 could be indicative of potential false positives. Lastly, we lack a replication cohort with similar cohort characteristics and data to verify our findings.
In conclusion, we identified two cross-ancestry CpG sites associated with SB and 122 associated with MPA. Overall, this study provides new insights into the epigenomic associations with physical activity that may be further explored.