2.1. Aim and design
The aim of this study was to test whether age-related circulating blood factors identified in murine models are associated with cognitive performance in humans, particularly those at risk for AD. First, we conducted a systematic review to identify these key blood factors. Second, we computed protein-based polygenic risk scores (protPRS) for these factors, using genetic variants known to affect protein expression levels (protein quantitative trait loci, pQTL). Third, we validated the resulting scores against measured protein levels in two independent human cohorts to assess their reliability as genetic proxies. Finally, we investigated the relationship between the validated scores and cognitive performance in cognitively unimpaired (CU) individuals at higher risk of developing AD dementia (Figure 1). This design tests the translational potential of animal model findings for human health outcomes, and may offer insight into the biological mechanisms of aging and into targets for interventions against age-related neurological conditions.
2.2. Systematic Review
We conducted a comprehensive systematic literature review of publications indexed in PubMed (Medline) and Google Scholar until 2023, to identify circulating proteins with demonstrated aging or rejuvenating effects in mice. Our research adhered to a structured review process aligned with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines22. The central research question was formulated as follows: "Which circulating blood proteins have been experimentally shown to manifest either aging or rejuvenating effects on the mouse brain?"
To address this question, we established specific inclusion and exclusion criteria:
Inclusion criteria:
1. Studies that used experimental methodologies such as parabiosis, administration of blood from young mice to aged mice (or vice versa), or in vivo peripheral administration of proteins. Peripheral administration methods encompassed intravenous and subcutaneous routes.
2. Studies assessing aging or rejuvenating effects through assessments of cellular hallmarks of aging. Rejuvenation, in this context, was defined as the process of reversing aging to a more youthful state in terms of organ performance or overall lifespan extension.
3. Peer-reviewed original research articles written in English with full text available.
Exclusion criteria:
1. Studies involving genetic interventions to either increase or decrease protein levels.
2. Studies conducted on disease models (only effects on healthy animals were considered).
3. Studies that utilized antibody administration to block or reduce the effects of the protein of interest.
4. Studies employing analogs of the protein of interest to simulate its effects.
5. Studies involving the administration of proteins that are not physiologically secreted.
Our literature search was systematically performed using Google Scholar and PubMed, employing the following combinations of keywords: "lifespan AND (aging OR ageing OR aged) AND mice", "lifespan AND rejuvenating AND mice", "(parabiosis OR parabiose) AND rejuvenating", "(parabiosis OR parabiose) AND (aging OR ageing OR aged)", "(plasma OR serum OR blood) AND rejuvenating AND factor", "(plasma OR serum OR blood) AND (aging OR ageing OR aged) AND factor".
Initially, we screened titles and abstracts to discard irrelevant studies. We also checked the reference lists of included articles to identify additional publications pertinent to our research question. Using Mendeley Cite as reference manager, we organized the pre-selected manuscripts in a dedicated folder and removed duplicates. Finally, we screened the full text of each paper to compile a definitive list of manuscripts that met the aforementioned inclusion and exclusion criteria (Supplementary Figure 1).
For the present study, our primary focus was on potential associations between proteins known to have an aging or rejuvenating effect on the mouse brain and the processes of brain aging and cognitive impairment. Therefore, we extracted from the review only those proteins with a demonstrated effect on the brain.
2.3. The Knight-ADRC cohort, genotyping profiling and proteomic assessment
Sample description
To validate the utility of the protPRSs as genetic proxies for plasma protein levels, we used proteomics and genetics data from the Charles F. and Joanne Knight Alzheimer Disease Research Center (Knight ADRC) cohort. The study was approved by the Institutional Review Board at Washington University School of Medicine in St. Louis. For the analysis, we selected a subset of 1,380 CU participants who were less than 70 years old. This subset was chosen to match the demographic characteristics of the Fenland study cohort, from which the summary statistics of the plasma pQTLs were extracted. Comprehensive sociodemographic and clinical information for the entire sample is presented in Table 1.
Table 1. Demographic characteristics of the Knight-ADRC study cohort. Data are reported as percentage (%) [sex, Aβ-status, APOE-ε4 carriership] and median (M) and interquartile range (IQR) [age]. Legend: *Aβ positivity (A+) is defined by CSF Aβ42 < 635 pg/ml43 or high amyloid probability score (APS)44. Abbreviations: Aβ, β-amyloid; APOE, apolipoprotein E.
| | Knight-ADRC cohort | N |
|---|---|---|
| Age (years), Median (IQR) | 68.0 (63.0;72.0) | 1380 |
| Sex, n (%) | | 1380 |
| Men | 569 (41.2%) | |
| Women | 811 (58.8%) | |
| Aβ-status*, n (%) | | 816 |
| Aβ-negative (A-) | 577 (70.7%) | |
| Aβ-positive (A+) | 239 (29.3%) | |
| APOE-ε4, n (%) | | 1355 |
| non-carriers | 741 (54.7%) | |
| carriers | 614 (45.3%) | |
Genotyping Profiling
Genotyping was performed using multiple genotyping arrays (CoreEx, GSA_v1, GSA_v2, GSA_v3, NeuroX2, OmniEx, Quad660). Pre-imputation QC, imputation and post-imputation QC were performed separately for each array, after which all arrays were merged into one dataset. At the pre-imputation stage, quality control was performed on both variants and samples using PLINK v1.9. Genotyped variants were retained if they met the following criteria: (1) genotyping call rate ≥ 98% (per variant and per individual); (2) MAF ≥ 0.01; and (3) Hardy-Weinberg equilibrium (HWE) P ≥ 1×10⁻⁶. Sample-level quality control included verification of sex codes (filtering out sex mismatches) and identification of duplicated sample IDs.
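As an illustration of the three variant-level criteria above, the following minimal Python sketch applies them to a single biallelic variant. It is not the PLINK implementation: the function name and input format are hypothetical, and the HWE check uses a 1-df chi-square approximation, whereas PLINK applies an exact test.

```python
import math


def variant_passes_qc(genotypes, min_call_rate=0.98, min_maf=0.01, min_hwe_p=1e-6):
    """Return True if one biallelic variant passes the three filters.

    genotypes: list of alternate-allele counts (0, 1, 2), with None for a
    missing call (hypothetical input format, for illustration only).
    """
    called = [g for g in genotypes if g is not None]
    if not genotypes or not called:
        return False

    # (1) Call rate per variant.
    if len(called) / len(genotypes) < min_call_rate:
        return False

    # (2) Minor allele frequency.
    n = len(called)
    p_alt = sum(called) / (2 * n)
    maf = min(p_alt, 1 - p_alt)
    if maf < min_maf:
        return False

    # (3) Hardy-Weinberg equilibrium: chi-square test with 1 degree of
    # freedom (an approximation; PLINK uses an exact test).
    q = 1 - p_alt
    observed = [called.count(k) for k in (0, 1, 2)]
    expected = [n * q * q, 2 * n * p_alt * q, n * p_alt * p_alt]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # survival function of chi2, 1 df
    return p_value >= min_hwe_p
```

For example, a variant with 25, 50 and 25 individuals carrying 0, 1 and 2 alternate alleles passes all three filters, whereas a variant called heterozygous in every individual fails the HWE filter.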
Phasing and imputation were performed on the TOPMed server (genome build GRCh38, imputation panel version R2), using Eagle v2.4 for phasing45; variants with an imputation quality of R² ≥ 0.3 were kept. In the post-imputation phase, both genotyped and imputed variants were retained based on two criteria: (1) a genotyping missing rate of ≤ 90% per variant; (2) a MAF of ≥ 0.0005.
Plasma sample collection and proteomics assessment
The plasma proteomics assessment in the Knight-ADRC cohort is described in detail in Timsina et al.46 In brief, 7,584 aptamers were measured using the SOMAscan 7k platform; after proteomics QC, 6,907 aptamers and 3,132 participants remained. Plasma proteomics data from all genetic ancestries underwent QC in seven steps: (1) filtering on limit of detection, scale factor difference, and coefficient of variation; (2) IQR-based detection of outlier expression levels; (3) removal of analytes and samples with call rate < 65%; (4) recalculation of analyte call rates and removal of analytes with call rate < 85%; (5) recalculation of sample call rates and removal of samples with call rate < 85%; (6) back-transformation into raw values; and (7) removal of non-human analytes and analytes without protein targets, yielding the final matrix.
2.4. The ALFA+ study, genotyping profiling, protein quantification and cognitive assessment
Sample description
We computed the protPRS in a total of 410 participants of the ALFA+ cohort with genotype and cognitive assessment data available. ALFA+ is a nested longitudinal study of the ALFA (Alzheimer and Families) parent cohort47. The ALFA parent cohort was established as a research platform to understand the early pathophysiological alterations in preclinical AD and is composed of CU individuals (between 45 and 75 years at baseline), enriched for family history of AD and genetic risk factors for AD48. In the present study, all ALFA+ individuals were CU and within an age range of 45 to 65 years. Of the cohort, 60% were women, 34% were positive for CSF β-amyloid (Aβ) status, and 55% were carriers of the APOE-ε4 allele. CSF Aβ status was defined by the CSF Aβ42/40 ratio, using a previously established cut-off49: participants were classified as CSF Aβ-positive (A+) if CSF Aβ42/40 ≤ 0.071. Comprehensive sociodemographic and clinical information for the entire sample is presented in Table 2.
Table 2. Demographic characteristics of the ALFA+ study cohort. Data are reported as percentage (%) [sex, Aβ-status, APOE-ε4 carriership] and median (M) and interquartile range (IQR) [age at lumbar puncture, years of education, PACC and EM]. *Aβ positivity (A+) is defined by CSF Aβ42/40 ≤ 0.071. Abbreviations: Aβ, β-amyloid; APOE, apolipoprotein E; PACC: preclinical Alzheimer cognitive composite.
| | ALFA+ cohort (N=410) | N |
|---|---|---|
| Age (years), Median (IQR) | 61.64 (58, 64.8) | 410 |
| Sex, n (%) | | 410 |
| Men | 163 (39.8%) | |
| Women | 247 (60.2%) | |
| Education Years, Median (IQR) | 12 (11, 17) | 410 |
| CSF Aβ42/40 status*, n (%) | | 388 |
| Aβ-negative (A-) | 255 (65.7%) | |
| Aβ-positive (A+) | 133 (34.3%) | |
| APOE-ε4, n (%) | | 400 |
| non-carriers | 179 (44.8%) | |
| carriers | 221 (55.2%) | |
| PACC, Median (IQR) | 0.055 (-0.44, 0.46) | 410 |
| Episodic Memory, Median (IQR) | 0.063 (-0.42, 0.47) | 410 |
Genotyping Profiling
DNA was obtained from blood samples through a salting-out protocol. Genotyping was performed with the Illumina Infinium Neuro Consortium (NeuroChip) Array (build GRCh37/hg19)50. Quality control was performed using PLINK v1.9. Samples with a call rate below 98%, mismatched genetically determined sex, or excess heterozygosity (four standard deviations from the mean) were excluded. Individuals with high genetic relatedness (IBD > 0.185) were also removed. After sample-level QC, genetic variants with a minor allele frequency (MAF) below 1%, a Hardy-Weinberg equilibrium p-value below 10⁻⁶, or a missingness rate above 5% were excluded. Imputation was performed using the Michigan Imputation Server with the Haplotype Reference Consortium panel (HRC r1.1 2016) and default parameters. Further details can be found in Vilor-Tejedor et al.48
Plasma sample collection and proteomics assessment
Whole blood was drawn with a 20G or 21G needle into a 10-ml EDTA tube (BD Hemogard, 10 ml, K2 EDTA, catalog no. 367525). Tubes were gently inverted 5–10 times and centrifuged at 2,000g for 10 min at 4 °C. The supernatant was aliquoted in volumes of 0.5 ml into sterile polypropylene tubes (Sarstedt Screw Cap Micro Tube, 0.5 ml, PP, ref. no. 72.730.105) and immediately frozen at −80 °C. The samples were processed at room temperature. The time between collection and freezing of plasma samples was <30 min.
Proteomic profiling of non-fasted EDTA plasma samples from 368 participants of the ALFA+ cohort was performed by SomaLogic Inc. (Boulder, CO, USA) using an aptamer-based technology (SomaScan proteomic assay). Relative protein abundances of 7,289 human protein targets were evaluated (SomaLogic v4.1 7K) and quantified as relative fluorescent units (RFU). Quality control (QC) was performed by SomaLogic at the sample and aptamer levels using control aptamers and calibrator samples. At the sample level, hybridization controls were used to correct for systematic variability in hybridization, and calibrators were used to correct for total signal differences caused by assay variance between individuals. The resulting hybridization scale factors and median scale factors were used to normalize data across samples within a run. Aptamer mapping to UniProt identifiers and gene names was provided by SomaLogic. In addition, for protein proxies significantly associated with cognitive performance in ALFA+ individuals, protein levels were measured by immunoassay in fasting EDTA plasma samples. Plasma TIMP2 was measured with the Human TIMP-2 Quantikine ELISA Kit (#DTM200; R&D Systems, Inc., Minneapolis, USA), following the manufacturer's instructions, at the Clinical Neurochemistry Laboratory, Sahlgrenska University Hospital, Mölndal, Sweden. Outliers in protein expression levels (SomaLogic and ELISA) were identified and removed based on the IQR.
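The IQR-based outlier removal mentioned above can be sketched as follows. The fence multiplier is not stated in the text, so the conventional value of 1.5 used here is an assumption, as is the function name.

```python
import statistics


def remove_iqr_outliers(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional Tukey fence and is assumed here;
    the multiplier actually used in the study is not reported.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]
```

Applied to a list of protein levels for one analyte, the function returns the values that fall within the fences and discards the rest.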
Cognitive assessment
All participants of the ALFA+ cohort underwent a cognitive test battery for the detection of early decline. A composite to assess cognitive performance was created based on the Preclinical Alzheimer Cognitive Composite (PACC)51. For our study, a modified PACC composite was created by averaging the Z-scores of the following variables: 1) Total Paired Recall (TPR) and 2) Total Delayed Free Recall scores of the Memory Binding Test (MBT)52,53, 3) the Coding subtest of the Wechsler Adult Intelligence Scale-Fourth Edition (WAIS-IV), and 4) semantic fluency. Moreover, two additional cognitive composites to assess global episodic memory (EM) and executive function (EF) were calculated by creating Z-scores for the cognitive measures from all MBT scores and from WAIS-IV subtests (Coding, Digit Span, Visual Puzzles, Similarities and Matrix Reasoning) respectively.
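A composite built by averaging Z-scores, as described above for the modified PACC, can be sketched as below. This is a minimal illustration: the function names are hypothetical, and Z-scores are computed against the sample mean and standard deviation, which is an assumption about the reference distribution.

```python
import statistics


def z_scores(values):
    """Standardize raw test scores using the sample mean and SD."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def composite(score_columns):
    """Average per-participant Z-scores across tests.

    score_columns: one list of raw scores per test, all ordered by
    participant (for the modified PACC these would be the four
    component measures).
    """
    z_columns = [z_scores(column) for column in score_columns]
    return [statistics.fmean(per_person) for per_person in zip(*z_columns)]
```

Each participant's composite is the mean of their standardized scores, so tests measured on different scales contribute equally.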
2.5. Statistical analysis
Proteomic-based genetic scores quantification
We obtained summary statistics from Pietzner et al.54 on protein quantitative trait loci (pQTLs) linked to plasma levels of the proteins identified through the systematic review. That study encompassed a cohort of 10,708 middle-aged, healthy individuals of European descent from the Fenland study, in whom the SomaScan (SomaLogic) platform was used to measure plasma levels of 4,775 distinct proteins.
We then computed protein genetic scores (protPRS) using the PRSice-2 software55. PRSice computes a protPRS by summing all pQTL alleles carried by a participant, weighting each by the pQTL allele effect size estimated in the Pietzner et al. study, and normalizing the score by the total number of pQTLs identified for each protein. The pQTLs included in the protPRS computations were those at a suggestive genome-wide level of significance (p < 5×10⁻⁵).
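The score described above can be written as a short sketch. It follows the verbal description in the text (weighted allele sum divided by the number of pQTLs) rather than the internals of PRSice, which may treat ploidy and missing genotypes differently; the function name and input format are hypothetical.

```python
def prot_prs(dosages, weights):
    """Weighted allele score for one participant and one protein.

    dosages: dict mapping variant ID -> number of effect alleles (0 to 2),
             or None when the genotype is missing.
    weights: dict mapping variant ID -> pQTL effect size from the
             discovery summary statistics.
    Returns None if no weighted variant is genotyped.
    """
    used = [v for v in weights if dosages.get(v) is not None]
    if not used:
        return None
    total = sum(weights[v] * dosages[v] for v in used)
    # Normalize by the number of variants contributing to the score.
    return total / len(used)
```

For instance, a participant carrying two effect alleles at a variant with weight 0.5 and one effect allele at a variant with weight -0.2 obtains (2 × 0.5 + 1 × (-0.2)) / 2 = 0.4.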
A total of 13 protPRSs were computed for 11 plasma proteins; 6 protPRSs for factors associated with brain aging and an additional 6 for factors linked to brain rejuvenation in mice. Summary statistics for pQTLs associated with CCL11 were not available in the Pietzner et al. study; thus, the protPRS of CCL11 was not computed. For haptoglobin (HP) and brain-derived neurotrophic factor (BDNF), summary statistics were available for two different SomaLogic measurements (two different aptamers), so two protPRSs were computed for each of these proteins (BDNF 1-2, HP 1-2). protPRSs were calculated on representative genetic variants per linkage disequilibrium (LD) block (clumped variants), using an LD cut-off of r² > 0.1 in a 250-kb window. Results were displayed at a restrictive threshold of 5×10⁻⁵. Additional information on the number of pQTLs included in each protPRS is reported in Supplementary Table 1.
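The clumping step can be illustrated with a greedy sketch: variants are ranked by p-value, and a variant is kept as an index variant only if it is not in LD (r² > 0.1 within 250 kb) with an index variant already kept. This is a simplified stand-in for the clumping performed by the software; the function names and input format are hypothetical, and r² is estimated here as the squared Pearson correlation of genotype dosages.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def clump(variants, r2_threshold=0.1, window_bp=250_000, p_threshold=5e-5):
    """Return IDs of index variants after greedy LD clumping.

    variants: list of dicts with keys "id", "chrom", "pos", "p" and
    "genotypes" (dosages in a reference sample).
    """
    candidates = sorted(
        (v for v in variants if v["p"] < p_threshold), key=lambda v: v["p"]
    )
    index_variants = []
    for v in candidates:
        in_existing_clump = any(
            v["chrom"] == k["chrom"]
            and abs(v["pos"] - k["pos"]) <= window_bp
            and r_squared(v["genotypes"], k["genotypes"]) > r2_threshold
            for k in index_variants
        )
        if not in_existing_clump:
            index_variants.append(v)
    return [v["id"] for v in index_variants]
```

Only the returned index variants would then enter the score, so correlated pQTLs from the same LD block are not counted more than once.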
Validation of genetic proxies of protein levels
To validate the utility of protPRSs as genetic proxies for plasma protein levels, we employed linear regression models in two independent cohorts (Knight-ADRC and ALFA+). In each cohort, protPRSs were used as predictors against individual plasma protein levels as outcomes, to provide a replication of the predictability of these genetic scores. All models were adjusted for age and sex to control for basic demographic variations. We further refined our analysis through stratified assessments based on sex, Aβ status, and APOE-ε4 carriership. Additionally, to ensure uniformity and comparability in our analysis, we standardized all protPRSs via z-scoring and we also scaled the protein levels primarily to enhance the clarity of visual representations in the outputs.
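The validation model described above (protein level regressed on a z-scored protPRS, adjusted for age and sex) can be sketched with ordinary least squares in pure Python. The function names are hypothetical, and the sketch returns point estimates only; standard errors and p-values would come from a statistics package in practice.

```python
import statistics


def zscore(values):
    """Standardize a variable to mean 0 and SD 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def fit_ols(outcome, predictors):
    """Ordinary least squares; returns [intercept, beta_1, beta_2, ...].

    outcome: list of outcome values (e.g. plasma protein levels).
    predictors: list of predictor columns (e.g. [prs_z, age, sex]).
    """
    rows = [[1.0] + list(p) for p in zip(*predictors)]
    k = len(rows[0])
    # Build the normal equations (X'X) b = X'y as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, outcome))]
        for i in range(k)
    ]
    # Solve by Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]
```

Calling `fit_ols(protein, [zscore(prs), age, sex])` returns the intercept followed by the coefficients for the protPRS, age and sex; the second element is the change in protein level per standard deviation of the protPRS.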
Associations of protein genetic scores with cognitive composites
To evaluate the associations between protPRSs and cognitive performance, we conducted linear regression analyses across cognitive composites. Each protPRS served as a predictor with cognitive composites as outcomes. Our models were adjusted for age, sex, and years of education to account for potential confounding effects. To enable direct comparisons of beta estimates, protPRSs were standardized using z-scoring techniques.
Further, we conducted stratified analyses to investigate the potential influence of sex, Aβ status (defined as CSF Aβ42/40 ratios below 0.071)49, and APOE-ε4 carriership on the association between protPRSs and cognitive outcomes. Statistical significance was evaluated at three levels: nominal significance, adjustment for the two experimental groups (aging/rejuvenating), and correction for multiple testing using the False Discovery Rate (FDR) method. Results were reported with the corresponding significance thresholds (*p<0.05, **p<0.025, ***FDR-corrected p-value <0.05). For the significant genetic proxies, we validated aptamer recognition of the respective protein using a non-parametric Spearman correlation test between SomaLogic protein levels and those measured by ELISA immunoassay.
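The FDR correction can be sketched as follows. The text does not name the specific procedure, so the Benjamini-Hochberg step-up method shown here is an assumption, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumes the 'FDR method' refers to the Benjamini-Hochberg
    procedure, which the text does not state explicitly.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

An association would then carry the *** flag when its adjusted p-value is below 0.05, while the ** threshold of 0.025 corresponds to 0.05 divided by the two experimental groups.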
TIMP2 protPRS annotation and gene set enrichment analysis
Significant plasma pQTLs (P < 5×10⁻⁵) included in the TIMP2-protPRS after clumping were annotated using the Ensembl Variant Effect Predictor (release 111)56, based on GENCODE v19 transcripts for the GRCh37 human genome assembly57 (Supplementary Tables 2-3). To gain further insight into putative biological mechanisms, gene set enrichment analysis of the annotated genes was conducted using clusterProfiler58 (FDR-adjusted p-value ≤ 0.05). Furthermore, GTEx v8 eQTL data were used to establish tissue-specific gene expression patterns and their relevance to the TIMP2 protein levels measured in our pQTL analysis (GENE2FUNC function59, FUMA v1.5.2). By mapping eQTLs from GTEx to the genomic locations identified in our TIMP2 pQTL analysis, we assessed whether variants that affect TIMP2 plasma protein levels also significantly influence gene expression in a tissue-specific manner across 54 tissue types.