2.1. Study design
Among the 1203 70-year-olds born in 1944 and included in the Swedish population-based Gothenburg H70 Birth Cohort Study (H70-1944) (for the full cohort description see Rydberg et al. [19]), 791 underwent a brain MRI scan (response rate = 65.8%) at baseline (Jan 2014-Dec 2016) (Figure 1). Of these, we excluded 3 participants with failed FreeSurfer preprocessing, 20 with poor image segmentation, 8 with a dementia diagnosis, 2 with a Mini-Mental State Examination (MMSE) score <24, 3 with Parkinson's disease (PD), 2 with multiple sclerosis, 6 with epilepsy, and 1 with brain cancer. This yielded a final sample of 746 individuals (353 men and 393 women), of whom a subset of 286 had cerebrospinal fluid (CSF) samples available. The H70 study received ethical approval from the Regional Ethical Review Board in Gothenburg. Written informed consent in agreement with the Helsinki Declaration was obtained from participants or, when the person was unable to provide consent, from their next-of-kin.
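The sample flow can be verified with simple arithmetic. The following minimal sketch uses only the counts reported in this section; the variable names are illustrative, and it assumes that the exclusion categories are mutually exclusive (which is consistent with the reported final sample size).

```python
# Counts reported in Section 2.1; names are illustrative only.
cohort_n = 1203   # 70-year-olds in H70-1944
scanned_n = 791   # participants with a baseline brain MRI

# Assumption: each excluded participant is counted in exactly one category.
exclusions = {
    "failed FreeSurfer preprocessing": 3,
    "poor image segmentation": 20,
    "dementia diagnosis": 8,
    "MMSE < 24": 2,
    "Parkinson's disease": 3,
    "multiple sclerosis": 2,
    "epilepsy": 6,
    "brain cancer": 1,
}

excluded_n = sum(exclusions.values())
final_n = scanned_n - excluded_n
response_rate = round(100 * scanned_n / cohort_n, 1)

print(f"excluded={excluded_n}, final sample={final_n}, response rate={response_rate}%")
```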
2.2. Brain MRI markers of neurodegeneration and cerebrovascular pathology
Participants were scanned on a 3.0T MRI scanner (Philips Medical Systems). The imaging protocol included a three-dimensional (3D) T1-weighted Turbo Field Echo (TFE) sequence to assess structural changes; T2-weighted images to exclude other pathologies (e.g., tumors) and to evaluate enlarged perivascular spaces; a fluid-attenuated inversion recovery (FLAIR) sequence for the detection of white matter hyperintensities (quantified as white matter hyperintensity volume, WMHV) and lacunes; a diffusion tensor imaging (DTI) sequence to determine white matter microstructural integrity; and a venous BOLD sequence (VenoBOLD) for the detection of microbleeds. For detailed MRI acquisition parameters see Rydberg et al. [19]. All images were processed and stored through the-Hive database system at Karolinska Institute [20], and underwent quality control as described previously [21].
2.2.1. Structural imaging. Brain MRIs were automatically pre-processed using FreeSurfer 7.2. Cortical thickness measurements were extracted for a total of 34 cortical regions of interest (ROIs) using the Desikan atlas [22]. Additionally, volume measurements were extracted for 7 subcortical ROIs, including the hippocampus, thalamus, amygdala, putamen, globus pallidus, nucleus accumbens, and caudate nucleus. Left and right hemisphere measurements were averaged for all ROIs. Subcortical volumes were adjusted for total intracranial volume (TIV) to account for natural interindividual variability in head sizes using residuals of a least-squares-derived linear regression between each volume and TIV from FreeSurfer [23]. Mean cortical thickness was used as a marker of non-AD-specific neurodegeneration [24]. A cortical signature of AD-specific neurodegeneration was calculated by averaging bilateral entorhinal, inferior temporal, middle temporal, and fusiform thickness, adjusted by cortical surface areas [25]. A cortical signature of brain resilience was calculated by averaging anterior cingulate and temporal pole thickness, adjusted by cortical surface areas [26].
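The residual-based TIV adjustment can be illustrated as follows. This is a minimal sketch of the general residual approach (regress each volume on TIV by ordinary least squares and keep the residuals), not the exact code used in the study; the function name and the toy values are hypothetical.

```python
from statistics import mean

def residualize(volumes, tiv):
    """Return residuals of a simple least-squares regression of volume on TIV."""
    v_bar, t_bar = mean(volumes), mean(tiv)
    sxx = sum((t - t_bar) ** 2 for t in tiv)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(tiv, volumes))
    slope = sxy / sxx
    intercept = v_bar - slope * t_bar
    # Residual = observed volume minus the volume predicted from TIV.
    return [v - (intercept + slope * t) for v, t in zip(volumes, tiv)]

# Toy values (hypothetical, not study data).
tiv = [1400.0, 1500.0, 1600.0, 1700.0]
hippocampus = [3.6, 3.9, 4.1, 4.6]
adjusted = residualize(hippocampus, tiv)
print(adjusted)
```

By construction, the adjusted values have zero mean and are uncorrelated with TIV, so differences between participants no longer reflect head size.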
2.2.2. FLAIR. FLAIR-derived WMHV was automatically segmented using the Lesion Segmentation Toolbox (LST 2.0.15) in the SPM software (https://www.fil.ion.ucl.ac.uk/spm/) [27]. WMHV was adjusted for each participant’s TIV from SPM. The continuous WMHV measure was categorized into three groups (tertiles) according to its distribution: T1 (<3.2), T2 (3.2-6.4), and T3 (>6.4 mm3), with higher values indicating greater small vessel vasculopathy.
2.2.3. DTI. A DTI-derived measure of fractional anisotropy (FA) was extracted using FMRIB’s Diffusion Toolbox from FSL (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki) [28]. The white matter FA measure used in our analyses was the mean FA of all voxels within the FA skeleton mask, computed for each participant as described in detail elsewhere [29]. The continuous FA measure was categorized into three groups (tertiles) according to its distribution: T1 (<0.3), T2 (0.3-0.4), and T3 (>0.4), with lower values indicating white matter degeneration.
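The tertile categorization of WMHV and FA can be sketched as a simple mapping from a continuous value to a group label. The cut-points below are those reported in the text; the function name is hypothetical, and placing values that fall exactly on a cut-point in T2 is an assumption based on how the ranges are written.

```python
def to_tertile(value, lower, upper):
    """Assign T1/T2/T3 from two cut-points.

    Assumption: values equal to a cut-point fall in T2,
    following the reported ranges (T1 < lower, T3 > upper).
    """
    if value < lower:
        return "T1"
    if value > upper:
        return "T3"
    return "T2"

WMHV_CUTS = (3.2, 6.4)  # cut-points reported for TIV-adjusted WMHV
FA_CUTS = (0.3, 0.4)    # cut-points reported for mean skeleton FA

print(to_tertile(7.0, *WMHV_CUTS))  # highest WMHV group
print(to_tertile(0.25, *FA_CUTS))   # lowest FA group
```

Note that the direction of risk differs between the two markers: T3 is the least favorable group for WMHV, whereas T1 is the least favorable group for FA.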
2.2.4. Visual assessment of markers of small vessel disease (SVD). Markers of cerebral SVD were visually assessed by a neuroradiologist using standard rating scales and according to the Standards for Reporting Vascular Changes on Neuroimaging (STRIVE) [30]. SVD markers included lacunes, identified as 3-15 mm hypointensities on T1 and FLAIR, or hyperintensities on T2; the count of cerebral microbleeds (CMBs) according to the Microbleed Anatomical Rating Scale (MARS) [31]; and the count of perivascular spaces in the centrum semiovale (PVScs) and basal ganglia (PVSbg), according to MacLullich’s rating scale (0-10 vs ≥11) [32]. Large infarctions were also visually assessed.
Mean thickness, AD and resilience signature thickness, and hippocampal volume were normally distributed and were therefore used as continuous measures. Conversely, WMHV and FA values were divided into tertiles because of their skewed distributions.
2.3. Non-neuroimaging data
We considered sociodemographic, lifestyle, health-related, and cognitive features as well as blood/CSF biomarkers to characterize the identified patterns. Information on sociodemographic and lifestyle factors, health conditions, and cognitive function was collected by trained research nurses or medical doctors. Detailed assessment procedures [19] and operationalization [33] are published elsewhere; thus, only a brief description is provided below.
2.3.1. Sociodemographic factors. Participants self-reported their biological sex at birth and educational attainment, which was categorized into primary and lower secondary schooling (<9 years of formal education), higher secondary schooling (9 years of schooling or ≤2 years of vocational training), and higher education (>2 years of vocational training or university).
2.3.2. Lifestyle and cardiometabolic conditions. Smoking was dichotomized into never vs current/former smoking. At-risk alcohol consumption was defined as consuming ≥100 g of alcohol/week (corresponding to heavy consumption as defined by the National Institute on Alcohol Abuse and Alcoholism). Physical activity was categorized as inactive (no activity/sedentary most of the day or irregular lighter walks) vs active (regular non-demanding physical exercise 2–4 times/week, demanding physical activities at least 1 h/week, or regular hard exercise). Body Mass Index (BMI) was dichotomized into normal vs overweight/obesity (BMI ≥25 kg/m2). Hypertension was defined as systolic blood pressure ≥140 mm Hg, diastolic blood pressure ≥90 mm Hg, or current antihypertensive treatment. Cardio- and cerebrovascular conditions (heart diseases such as myocardial infarction, angina pectoris, heart failure, and atrial fibrillation, as well as stroke/transient ischemic attack [TIA]) were diagnosed based on examinations, self-report, medication use, or linkage to the Swedish National Patient Register (NPR), using International Classification of Diseases, 10th edition (ICD-10) codes. Prediabetes and diabetes were identified based on self-reported medical history, use of glucose-lowering treatments (diet, oral hypoglycemic agents, or insulin), or fasting/nonfasting blood glucose of ≥7.0/11.1 mmol/L [33].
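For illustration, three of the dichotomizations described above can be written as simple rules. This is a sketch using the thresholds stated in the text; the function and argument names are hypothetical and do not correspond to variables in the H70 database.

```python
def has_hypertension(sbp_mmhg, dbp_mmhg, on_antihypertensives):
    """Systolic >= 140 mm Hg, diastolic >= 90 mm Hg, or current treatment."""
    return sbp_mmhg >= 140 or dbp_mmhg >= 90 or on_antihypertensives

def is_overweight_or_obese(weight_kg, height_m):
    """BMI (kg/m2) at or above 25."""
    return weight_kg / height_m ** 2 >= 25

def at_risk_alcohol(grams_per_week):
    """Weekly alcohol intake at or above 100 g."""
    return grams_per_week >= 100

# Hypothetical participant: normal systolic but raised diastolic pressure.
print(has_hypertension(138, 92, on_antihypertensives=False))
print(is_overweight_or_obese(80.0, 1.75))
print(at_risk_alcohol(99))
```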
2.3.3. Neuro-psychiatric conditions. Depression (major and minor) was diagnosed according to the Diagnostic and Statistical Manual of Mental Disorders (DSM), 4th or 5th edition criteria [34]. Dementia was diagnosed according to the DSM, 3rd edition revised (DSM-III-R) criteria, combining neuropsychiatric examination and key informant interviews, and was used only as an exclusion criterion in the current study [35]. Medical history of multiple sclerosis, epilepsy, brain cancer, and traumatic brain injury (TBI) was based on self-reported general health and close informant interviews.
2.3.4. Cognitive functioning. Cognitive functioning was assessed with a detailed cognitive test battery encompassing five composite domains: (1) episodic memory (Memory in Reality–free recall and 12-object delayed recall, and Thurstone’s picture memory); (2) executive function (Digit Span Backward and Figure Logic); (3) attention and perceptual speed (Figure Identification-PSIF and Digit Span Forward); (4) verbal fluency (Controlled Oral Word Association-FAS and semantic fluency from “animals” task); (5) visuospatial abilities (Koh’s Block Test) [33]. A composite score for global cognitive performance was generated by averaging the z-scores across the five domains.
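The global composite can be sketched as the per-participant mean of the five domain z-scores. The example below assumes that each domain score is standardized against the study sample's own mean and standard deviation, which the text does not specify; the function names and values are hypothetical.

```python
from statistics import mean, stdev

def zscores(values):
    """Standardize raw scores against the sample mean and SD (assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def global_composite(domain_scores):
    """Average z-scores across domains, participant by participant."""
    standardized = [zscores(scores) for scores in domain_scores.values()]
    return [mean(per_person) for per_person in zip(*standardized)]

# Hypothetical domain scores for four participants (not study data).
domains = {
    "episodic_memory": [10.0, 12.0, 14.0, 16.0],
    "executive_function": [5.0, 7.0, 6.0, 8.0],
    "attention_speed": [30.0, 28.0, 35.0, 33.0],
    "verbal_fluency": [40.0, 45.0, 38.0, 50.0],
    "visuospatial": [20.0, 22.0, 19.0, 25.0],
}
composite = global_composite(domains)
print(composite)
```

Because every domain is centered before averaging, the composite has a sample mean of zero, and each domain contributes equally regardless of its original scale.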
2.3.5. Blood biomarkers and apolipoprotein E (APOE) genotype. Blood samples were available for all participants and were analysed following standard laboratory routines at the Sahlgrenska University Hospital. The following indicators of altered lipid metabolism were considered: high triglycerides (≥1.7 mmol/L or use of lipid-lowering medication [Anatomical Therapeutic Chemical code C10]), reduced high-density lipoprotein (HDL) cholesterol (<1.03 mmol/L in men and <1.29 mmol/L in women), and low-density lipoprotein (LDL) cholesterol, which was divided into tertiles (T1 ≤3; T2 = 3.1-3.9; T3 ≥4). Additionally, altered homocysteine (>13.5 µmol/L) and C-reactive protein (CRP, ≥8 mg/L) were used as measures of vascular-related and systemic inflammation, respectively [17]. Finally, APOE was genotyped, and participants were categorized into ε4 allele carriers (one or two ε4 alleles) vs non-carriers.
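The sex-specific HDL rule and the other blood-based indicators can be expressed as short predicates. This is an illustrative sketch using the cut-offs given in the text; the function names, the sex labels, and the genotype string format (e.g., "e3/e4") are hypothetical.

```python
HDL_CUTOFFS_MMOL_L = {"man": 1.03, "woman": 1.29}  # sex-specific cut-offs from the text

def reduced_hdl(hdl_mmol_l, sex):
    """HDL below the sex-specific cut-off; raises KeyError for unknown labels."""
    return hdl_mmol_l < HDL_CUTOFFS_MMOL_L[sex]

def high_triglycerides(tg_mmol_l, on_lipid_lowering):
    """Triglycerides >= 1.7 mmol/L or lipid-lowering medication (ATC C10)."""
    return tg_mmol_l >= 1.7 or on_lipid_lowering

def is_e4_carrier(genotype):
    """True if at least one allele is e4; genotype format is hypothetical."""
    return "e4" in genotype.split("/")

# The same HDL value is classified differently in men and women.
print(reduced_hdl(1.1, "man"), reduced_hdl(1.1, "woman"))
print(high_triglycerides(1.2, on_lipid_lowering=True))
print(is_e4_carrier("e3/e4"))
```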
2.3.6. CSF biomarkers. CSF from lumbar puncture was available for a subset of 286 participants (see Arvidsson Rådestig et al. for the detailed CSF sampling procedure [36]). Biomarkers and their cut-offs were: β-amyloid 42 (Aβ42) ≤530 pg/mL, total-tau (t-tau) ≥350 pg/mL, and phosphorylated tau at threonine 181 (p-tau) ≥80 pg/mL [36].
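As an illustration, the CSF cut-offs can be applied as follows. The sketch assumes that values meeting a cut-off are the ones flagged, with the direction taken from the inequalities in the text (low Aβ42, high t-tau and p-tau); the function and key names are hypothetical.

```python
def csf_flags(abeta42_pg_ml, t_tau_pg_ml, p_tau_pg_ml):
    """Apply the reported cut-offs; direction follows the text's inequalities."""
    return {
        "abeta42_at_or_below_530": abeta42_pg_ml <= 530,
        "t_tau_at_or_above_350": t_tau_pg_ml >= 350,
        "p_tau_at_or_above_80": p_tau_pg_ml >= 80,
    }

# Hypothetical participant with low Abeta42 but tau values under the cut-offs.
flags = csf_flags(abeta42_pg_ml=480.0, t_tau_pg_ml=300.0, p_tau_pg_ml=45.0)
print(flags)
```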
2.4. Statistical analysis
2.4.1. Identification of the gray matter (GM) patterns through clustering. For the cross-sectional clustering, we selected 41 brain ROIs, of which 34 measured cortical thickness and 7 measured subcortical volumes. We first applied a Linear Mixed Effect Regression (LMER), controlling for TIV as a fixed effect in subcortical ROIs and subject as a random effect, and then performed unsupervised hierarchical clustering using random forest (R software, version 4.0.3, randomForest package, version 4.7-1.1). This clustering method has previously been used to identify atrophy subtypes in AD, PD, and DLB [11–13,37]. In our dataset, the optimal random forest parameters were ntree = 6000, mtry = 6, and nodesize = 3. The optimal cluster solution was established based on a composite score of the Dunn and Calinski-Harabasz (CH) indices [38]. For a detailed description of the clustering pipeline see Poulakis et al. [37]. To identify the most discriminating ROIs between the clusters, we then calculated Cohen's d between cluster 1 (used as the reference group) and each of clusters 2 to 5, based on independent two-sample t-tests for each ROI with False Discovery Rate (FDR) correction for multiple comparisons (p value <0.05).
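The effect-size and multiple-comparison steps can be sketched as follows. The example computes Cohen's d with a pooled standard deviation and adjusts a set of p-values with the Benjamini-Hochberg procedure. Both choices are assumptions, since the text specifies only "Cohen's d" and "FDR correction"; the function names and the input values are hypothetical, and the t-tests that would produce the p-values are not shown.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (assumption)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical thickness values for one ROI in two clusters (not study data).
d = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
# Hypothetical raw p-values for four ROIs.
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(d, adj)
```

In this setting, each ROI would contribute one d value and one adjusted p-value per comparison against cluster 1, and ROIs would be ranked by the magnitude of d among those surviving the FDR threshold.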
2.4.2. Characterization of the GM patterns. Differences between the clusters (1-5) in sociodemographic, lifestyle, and health-related factors, cognitive function, and fluid biomarkers were analyzed using the Chi-square test for categorical variables, one-way ANOVA for normally distributed continuous variables, and the Kruskal-Wallis test for non-normally distributed continuous variables. Multinomial logistic regression (MLR) was used to estimate Odds Ratios (ORs) and 95% confidence intervals (CIs) for the associations between the outcome, defined by cluster allocation obtained through unsupervised clustering (with cluster 1 as the reference group), and sociodemographic, lifestyle, health-related, and fluid biomarker predictors (as characterizing features or exposures). We fitted the following four separate MLR models: (1) the health and sociodemographic model included sex (reference = men), educational attainment (reference = primary/lower schooling), any APOE-ε4 carriership, MMSE, former/current smoking, at-risk alcohol consumption, physical inactivity, overweight/obesity, increased triglycerides, increased LDL, reduced HDL, hypertension, heart disease, stroke/TIA, prediabetes and diabetes, depression, and TBI; (2) the cerebrovascular model included mean FA, WMHV, PVScs, PVSbg, CMBs (absent vs present), lacunes (absent vs present), and large infarctions (absent vs present); (3) the inflammation model included altered homocysteine and CRP levels; (4) the CSF model was run only on the subset of participants with available CSF information, and included the Aβ42, t-tau, and p-tau biomarkers. Additionally, we evaluated whether GM pattern predicted cognitive status through six separate Generalized Linear Models (GLMs) with the composite scores for global cognition and the five cognitive domains as outcomes, and cluster allocation as the predictor. To gauge the risk of multicollinearity bias, we assessed the variance inflation factor (VIF) for all the variables included. The VIF consistently remained below 1.7, allowing for the inclusion of all variables. A two-sided p value <0.05 indicated statistical significance. All statistical analyses were performed with R software version 4.0.3.
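The multicollinearity check can be illustrated with a short sketch. For predictor j, the VIF equals 1 / (1 - R_j^2), where R_j^2 is obtained by regressing predictor j on all other predictors; equivalently, the VIFs are the diagonal elements of the inverse correlation matrix of the predictors. The example below uses this identity with NumPy and toy data. It is not the R code used in the study, and the function name is hypothetical.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF per column of X (rows = participants, columns = predictors).

    Uses the identity that the VIFs are the diagonal of the inverse
    correlation matrix of the predictors.
    """
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Toy predictors (hypothetical): uncorrelated columns give VIF = 1.
X_uncorrelated = [[1, 1], [2, -1], [3, -1], [4, 1]]
# Moderately correlated columns (r^2 = 0.2) give VIF = 1 / (1 - 0.2) = 1.25.
X_correlated = [[1, 1], [2, -1], [3, 1], [4, -1]]

vif_uncorrelated = variance_inflation_factors(X_uncorrelated)
vif_correlated = variance_inflation_factors(X_correlated)
print(vif_uncorrelated, vif_correlated)
```

A VIF of 1 indicates no linear dependence on the other predictors, so values below 1.7, as observed here, indicate only weak collinearity.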