Study design and participants
We used data from the Chinese Longitudinal Healthy Longevity Study (CLHLS), which are publicly available from Peking University Open Research Data (https://opendata.pku.edu.cn/dataverse/CHADS). The baseline and follow-up surveys were conducted in 1998, 2000, 2002, 2005, 2008–2009, 2011–2012, and 2014 in a randomly selected half of the counties and cities in 23 of the 31 provinces in China. The CLHLS was the first national longitudinal survey of the determinants of healthy aging among the oldest-old in China. Its rationale and design have been described in detail previously [31]. With 631 cities and counties randomly selected as sample sites, the study sample represents approximately 85% of the Chinese population (Fig. 1). The CLHLS was approved by the Institutional Review Board, Duke University (Pro00062871), and the Biomedical Ethics Committee, Peking University (IRB00001052–13074). All participants or their legal representatives signed written consent forms to participate in the baseline and follow-up surveys. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guidelines.
A total of 34394 participants were recruited across 5 waves of the CLHLS from 2000 to 2011. We excluded 545 participants aged 64 years or younger, 25220 participants without available SIRT1 genotype data, 212 participants with missing PM2.5 values, 567 participants who were not of Han ethnicity (according to their ID card or household registry) or who had missing covariate values, and 767 participants who were lost to follow-up at the first follow-up survey. The final sample meeting the inclusion criteria therefore comprised 7083 participants (Figure S1). The sample consisted of 3677 women and 3406 men; 3272 participants were 65–79 years of age, 1840 were 80–89 years of age, 1305 were 90–99 years of age, and 667 were 100 years of age or older. To assess potential selection bias, we compared gender, age, and residence between participants who were lost to follow-up at the first follow-up survey (767 participants) and those who were not (7083 participants). The two groups differed significantly in age (86.0 vs 81.1 years) and residence (rural: 53.8% vs 66.7%), but not in sex (female: 52.7% vs 51.9%).
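The participant flow described above can be checked with a short tally. This is a minimal sketch: the counts are taken from the text, while the variable and function names are illustrative assumptions and not part of the study's own code.

```python
# Counts reported in the text; names below are hypothetical.
TOTAL_RECRUITED = 34394

# (reason, number excluded), applied in the order given in the text.
EXCLUSIONS = [
    ("aged 64 years or younger", 545),
    ("no SIRT1 genotype data", 25220),
    ("missing PM2.5 values", 212),
    ("non-Han ethnicity or missing covariates", 567),
    ("lost to follow-up at first follow-up survey", 767),
]


def participant_flow(total, exclusions):
    """Return (reason, excluded, remaining) rows for a sequential exclusion."""
    rows = []
    remaining = total
    for reason, n_excluded in exclusions:
        remaining -= n_excluded
        rows.append((reason, n_excluded, remaining))
    return rows


flow = participant_flow(TOTAL_RECRUITED, EXCLUSIONS)
final_n = flow[-1][2]  # participants left after the last exclusion step

for reason, n_excluded, remaining in flow:
    print(f"excluded {n_excluded:>6} ({reason}); remaining {remaining}")
```

The final count from this tally matches the 7083 participants reported, which also equals the sum of the 3677 women and 3406 men.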
Procedures
We obtained estimates of ground-level PM2.5 concentrations from the Atmospheric Composition Analysis Group and assigned them on the basis of participants’ residential addresses [32]. This dataset combines remote sensing from the National Aeronautics and Space Administration’s Moderate Resolution Imaging Spectroradiometer, Multiangle Imaging Spectroradiometer, and Sea-viewing Wide Field-of-view Sensor satellite instruments; vertical profiles derived from the GEOS-Chem chemical transport model; and calibration to ground-based observations of PM2.5 using geographically weighted regression. Annual PM2.5 estimates were calculated from 2000 to 2014 at a spatial resolution of 1 km × 1 km, making this the longest-running and highest-resolution exposure dataset available [33, 34]. These estimates were highly consistent with out-of-sample cross-validated concentrations from monitors (R² = 0.81) and with another exposure dataset in China (R² = 0.79) [32]. A previous study found that the three-year average PM2.5 before death or the end of the study had the strongest association with mortality among older adults in China [33]. We therefore used the three-year average PM2.5 to reflect ambient air pollution in this study.
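As an illustration of the exposure metric, the sketch below averages three consecutive annual PM2.5 values ending in the year of death or end of follow-up. The exact window is not specified in the text, so including the end year itself is an assumption, as are the function name, the handling of missing years, and the example values.

```python
def three_year_average(annual_pm25, end_year):
    """Average annual PM2.5 over end_year and the two preceding years.

    annual_pm25: dict mapping calendar year -> annual PM2.5 estimate
    end_year:    year of death or of the end of follow-up

    Assumption: the window includes end_year itself. Returns None when
    any of the three years is missing, rather than averaging fewer years.
    """
    years = (end_year - 2, end_year - 1, end_year)
    if any(year not in annual_pm25 for year in years):
        return None
    return sum(annual_pm25[year] for year in years) / 3


# Made-up annual values for one hypothetical residential grid cell.
example_series = {2009: 50.0, 2010: 56.0, 2011: 62.0}
example_exposure = three_year_average(example_series, 2011)
```

Returning None for an incomplete window mirrors the exclusion of participants with missing PM2.5 values, although how the study handled partial windows is not stated.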
We selected candidate SNPs from the public database of the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/gene/23411) to cover the SIRT1 gene region at equally spaced intervals. The minor allele frequency (MAF) of each polymorphism was required to be > 10% [35]. The selected and genotyped SIRT1 SNPs were rs12778366 (promoter), rs3758391 (promoter), rs2273773 (exon), rs2236319 (intron), rs1885472 (intron), rs7069102 (intron), rs10823112 (intron), rs3818291 (intron), and rs4746720 (intron). Among the 9 SNPs, SIRT1_366 (rs12778366), SIRT1_391 (rs3758391), SIRT1_773 (rs2273773), and SIRT1_720 (rs4746720) were annotated and treated as tagging SNPs because of high linkage disequilibrium, and were used as proxies for the remaining 5 SNPs in the analysis (Table S1). Hardy-Weinberg equilibrium of the 9 SIRT1 SNPs was tested with the GENEPOP package (version 1.2). To define SIRT1 carrier status, we coded each genotype by its number of minor alleles [36]. In the additive model, genotypes containing zero, one, or two copies of the minor allele were coded as 0, 1, or 2, respectively. In the dominant model, genotypes containing at least one copy of the minor allele were coded as 1, and all others as 0. In the recessive model, genotypes containing two copies of the minor allele were coded as 1, and all others as 0.
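The three genetic coding schemes can be written out as a small function. This is a sketch only: the two-letter genotype strings, the choice of "C" as the minor allele, and the function name are assumptions made for illustration.

```python
def code_genotype(genotype, minor_allele, model="additive"):
    """Code a biallelic genotype by its number of minor alleles.

    genotype:     two-character string such as "TC" (hypothetical format)
    minor_allele: single character naming the minor allele
    model:        "additive", "dominant", or "recessive"
    """
    if len(genotype) != 2:
        raise ValueError("genotype must contain exactly two alleles")
    n_minor = genotype.count(minor_allele)  # 0, 1, or 2 copies

    if model == "additive":
        return n_minor                      # 0, 1, or 2
    if model == "dominant":
        return 1 if n_minor >= 1 else 0     # at least one copy
    if model == "recessive":
        return 1 if n_minor == 2 else 0     # two copies only
    raise ValueError(f"unknown model: {model}")


# Hypothetical SNP with major allele "T" and minor allele "C".
coded = {
    g: tuple(code_genotype(g, "C", m)
             for m in ("additive", "dominant", "recessive"))
    for g in ("TT", "TC", "CC")
}
```

Here `coded` maps each genotype to its (additive, dominant, recessive) codes, so a heterozygote is a carrier under the dominant model but not under the recessive one.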
The primary outcome was all-cause mortality. Mortality information was obtained from the follow-up surveys conducted in 2011 and 2014. The date of death was validated against death certificates when available; otherwise, the report of a close family member was used.
Covariates were chosen as potential confounders of the association between exposures and outcomes or as predictors of outcomes. All self-reported information was collected through face-to-face home interviews by trained research staff. Interviewees were encouraged to answer as many questions as possible. If they were unable to answer, a close family member or another proxy, such as a primary caregiver, provided the answers. We included baseline age, gender, marital status, residence, education, occupation, smoking status, drinking status, physical activity, and wave of first interview as covariates. We classified marital status into two categories: married (currently married and living with spouse) and not married (widowed, separated, divorced, never married, or married but not living with spouse). We divided residence into urban and rural areas based on governmental administrative categories. We used years of schooling to measure education level. We categorized occupation into two groups: non-manual (professional and technical personnel; governmental, institutional, or managerial personnel) and manual (agriculture, forestry, animal husbandry, and fishery workers; industrial workers; and others). We divided regular exercise, smoking, and alcohol drinking status into three categories: “Current”, “Former”, and “Never”. For example, participants were asked “Do you do exercise regularly at present (planned exercise like walking, playing balls, running and so on)?” and/or “Did you do exercise regularly in the past?”. We defined regular exercise status as “Current” for participants who answered “Yes” to the first question, “Former” for those who answered “No” to the first question and “Yes” to the second, and “Never” for those who answered “No” to both questions.
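The three-level classification of exercise status follows directly from the two questions quoted above. In this sketch the boolean encoding of the answers, the function name, and the treatment of unanswered questions (returned as None) are assumptions; the text does not say how missing answers were handled.

```python
def exercise_status(exercises_now, exercised_before):
    """Classify regular exercise as "Current", "Former", or "Never".

    exercises_now:    True/False answer to the present-exercise question
    exercised_before: True/False answer to the past-exercise question;
                      may be None if it was not asked or not answered

    Assumption: a missing answer that is needed for the classification
    yields None instead of a category.
    """
    if exercises_now is None:
        return None
    if exercises_now:
        return "Current"        # "Yes" to the first question
    if exercised_before is None:
        return None
    if exercised_before:
        return "Former"         # "No" then "Yes"
    return "Never"              # "No" to both questions
```

The same two-question logic would apply to smoking and alcohol drinking status, assuming those items were asked in the same present/past form.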
We categorized the participants into two geographical regions: South China (Guangdong, Guangxi, Hainan, Chongqing, Sichuan, Anhui, Hubei, Fujian, Jiangxi, Jiangsu, Shanghai, and Zhejiang) and North China (Beijing, Shandong, Heilongjiang, Jilin, Liaoning, Hebei, Henan, Shanxi, Tianjin, and Shaanxi).
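For the regional stratification, the grouping above amounts to a lookup from province to region. The sketch below uses the province lists exactly as given in the text; the constant and function names are illustrative.

```python
# Province lists as given in the text.
SOUTH_CHINA = frozenset({
    "Guangdong", "Guangxi", "Hainan", "Chongqing", "Sichuan", "Anhui",
    "Hubei", "Fujian", "Jiangxi", "Jiangsu", "Shanghai", "Zhejiang",
})
NORTH_CHINA = frozenset({
    "Beijing", "Shandong", "Heilongjiang", "Jilin", "Liaoning",
    "Hebei", "Henan", "Shanxi", "Tianjin", "Shaanxi",
})


def region_of(province):
    """Return "South" or "North" for a listed province."""
    if province in SOUTH_CHINA:
        return "South"
    if province in NORTH_CHINA:
        return "North"
    # Unlisted provinces are rejected rather than silently assigned.
    raise KeyError(f"province not in either region list: {province}")
```

Raising an error for an unlisted province is a design choice made here so that a misspelled name cannot be assigned to a region by default.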
Statistical analysis
We fitted Cox proportional hazards models for each SIRT1 SNP and for three-year average PM2.5 separately to evaluate their individual effects on mortality. We then added an interaction term between the SNP and three-year average PM2.5 to the Cox model to investigate the interaction of SIRT1 and PM2.5. Each genotype was defined as low-risk (G0) or high-risk (G1) according to the SNP’s effect on mortality. The three-year average PM2.5 was dichotomized into low exposure (E0) and high exposure (E1) using its median as the cut-off point. The effect of the gene under different environmental exposures, and the effect of the environment among participants with different genotypes, were evaluated by comparing the four categories of the combined term (G0 × E0, G0 × E1, G1 × E0, and G1 × E1). To further investigate the gender-specific G × E effect, we adopted an integrated statistical model with a three-way interaction to assess how the hazard ratio of mortality varied across combinations of sex, SIRT1 genotype, and PM2.5 exposure [37].
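The construction of the four-level G × E term can be sketched as follows. Two details are assumptions because the text does not specify them: values exactly equal to the median are placed in the low-exposure group, and the labels use an ASCII "x". The function names and example values are also illustrative.

```python
from statistics import median


def exposure_levels(pm25_values):
    """Dichotomize PM2.5 at the sample median into "E0" (low) or "E1" (high).

    Assumption: values equal to the median are classed as low exposure.
    """
    cut_off = median(pm25_values)
    return ["E1" if value > cut_off else "E0" for value in pm25_values]


def combined_category(high_risk_genotype, exposure_level):
    """Combine genotype risk (bool) and exposure level into one label."""
    genotype_level = "G1" if high_risk_genotype else "G0"
    return f"{genotype_level}x{exposure_level}"


# Made-up values for four hypothetical participants.
pm25 = [35.0, 48.0, 52.0, 70.0]          # median is 50.0
high_risk = [False, True, False, True]   # G0, G1, G0, G1

levels = exposure_levels(pm25)
categories = [combined_category(g, e) for g, e in zip(high_risk, levels)]
```

With G0 × E0 as the reference level, the resulting categorical variable can be entered into a Cox model to compare the remaining three groups against it.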
We measured survival time in months from the date of the first interview to the recorded date of death or the date of the last interview. All models were adjusted for age, gender, marital status, residence area, education, occupation, smoking status, drinking status, and physical activity. We calculated hazard ratios (HRs) and 95% CIs to indicate the magnitude of the effects of SIRT1 and PM2.5 on mortality. We tested for effect modification by potential modifiers and then performed stratified analyses by sex, urban or rural residence, financial status, smoking status, and North or South geographical region.
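The survival-time definition can be illustrated with a small helper that returns the follow-up time and an event indicator. The conversion from days to months (dividing by an average month length of 30.4375 days) is an assumption, since the text does not state how months were computed; the function name and example dates are likewise hypothetical.

```python
from datetime import date

DAYS_PER_MONTH = 30.4375  # assumption: average month length (365.25 / 12)


def survival_months(first_interview, death_date, last_interview):
    """Return (months of follow-up, died) for one participant.

    Follow-up ends at the recorded death date when there is one,
    otherwise at the last interview date (censored observation).
    """
    died = death_date is not None
    end_date = death_date if died else last_interview
    months = (end_date - first_interview).days / DAYS_PER_MONTH
    return months, died


# Hypothetical participants: one death, one censored at the last interview.
months_dead, event_dead = survival_months(
    date(2008, 6, 1), date(2011, 6, 1), None)
months_cens, event_cens = survival_months(
    date(2008, 6, 1), None, date(2014, 6, 1))
```

The pair of values returned for each participant is the usual input to a Cox proportional hazards model: a time-to-event and a flag that distinguishes deaths from censored observations.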
We performed additional analyses to verify the results. We compared baseline characteristics between the included participants and all participants in the five waves of the CLHLS to assess the representativeness of the sample. We also excluded participants who died within six months. We re-fitted the regression models using more informative covariates and after excluding covariates with missing values. We also plotted the geographical distribution of the included participants.
We used R 3.6.1 (R Foundation for Statistical Computing) and SAS University Edition to perform all analyses. All p values were from 2-sided tests, and results were deemed statistically significant at p < .05.