Study Design
A total of 12,060 HCs and 3,245 patients with various neurological diseases were included. The patients included individuals with neurodegenerative (including mild cognitive impairment [MCI, n = 212], AD [n = 467], and Parkinson's disease [PD, n = 1,263]), cerebrovascular (CSVD [n = 498]), and neuroinflammatory (including MS [n = 497] and aquaporin-4 antibody-seropositive neuromyelitis optica spectrum disorder [NMOSD, n = 308]) diseases. Additional details can be found in eMethods, eFigure 1 and eFigure 2.
To quantify the individualized brain structural deviations of patients with neurological diseases, we first established a set of normative references using global and regional brain structural measures obtained from 3D T1-weighted images of healthy Chinese individuals aged 4.5 to 99 years. Briefly, a total of 228 brain structural measures were extracted at both the global and regional levels. The global measures included volumes of cortical and subcortical gray matter (GM), white matter (WM), cerebrospinal fluid (CSF), the cerebellum, and the brainstem, along with the mean cortical thickness and total surface area. Regional measures included cortical and subcortical volume, cortical thickness, and surface area of regions determined according to the Desikan–Killiany atlas. The deviations (centile scores) of these structural measures from the established normative references were calculated for individuals in the HC and neurological disease groups.
Three clinical tasks were performed to evaluate the individualized utility of these deviation scores, as described in the Introduction. In addition to the clinical tasks, the stability of the deviation scores was assessed by comparing the baseline and follow-up MR scans. Sensitivity analyses were conducted, including (1) comparison of the fitted Chinese normative references using different models, (2) comparison of male and female normative references, and (3) comparison of the clinical tasks using other measures, including Z scores derived from the Chinese normative references and raw brain structural measures (the outputs of FreeSurfer segmentation). A concise overview of the study design is provided in Fig. 1.
Finally, we developed a clinically applicable individualized brain health report. This report incorporates the findings from the assessment of deviation scores and provides personalized insights into the structural characteristics of the brain. The report serves as a valuable tool for clinicians and patients alike, offering actionable information for improving diagnosis, treatment planning, and monitoring of neurological conditions.
Chinese Normative References and Deviations of Patients with Neurological Diseases
We first established Chinese population-specific normative references of global and regional brain structural measures using 3D T1-weighted images (eMethods, Fig. 2 and eFigure 3) obtained from 12,060 healthy Chinese individuals (age range: 4.5–99 years; mean ± SD age, 37 ± 18 years; 5,987 females [49.6%]) (Table 1). This task was accomplished with the generalized additive models for location, scale, and shape (GAMLSS) package in R. This approach parallels that of a previous study12, in which normative references were developed on the basis of neuroimaging data from 101,457 healthy individuals primarily of European and North American descent. For clarity, we refer to these references as international references, facilitating direct comparison with the Chinese normative references established in our study. Second, we calculated deviation scores (centile scores) that benchmarked each individual scan against these normative references for both the HC and disease groups (comprising 3,245 patients with a mean age of 56 ± 16 years; 1,780 female patients [54.9%]) (Table 1). The deviation scores of the disease groups are summarized in Fig. 3. Additional analyses of these deviation scores in the disease groups using Cohen’s d, the overlapping coefficient and the univariate predictive ability are provided in eResults 1, and more details on the findings in the different neurological diseases are shown in eFigures 4–11.
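As a minimal illustration of what a centile-type deviation score represents, the sketch below places one measurement within an age- and sex-specific reference distribution. It assumes a normal reference purely for simplicity; the distribution family fitted with GAMLSS is not stated in this section, and the reference values, the measurement and the function name `centile_score` are hypothetical.

```python
from statistics import NormalDist

def centile_score(value, mu, sigma):
    """Centile (0 to 1) of `value` under an assumed normal reference N(mu, sigma)."""
    return NormalDist(mu, sigma).cdf(value)

# Hypothetical reference for one structural measure at a given age and sex.
# These numbers are illustrative only and are not taken from the study.
reference_mu = 520.0     # assumed median volume
reference_sigma = 40.0   # assumed spread

measured = 480.0         # hypothetical individual measurement
centile = centile_score(measured, reference_mu, reference_sigma)
z_score = (measured - reference_mu) / reference_sigma
print(f"centile = {centile:.3f}, z = {z_score:.2f}")
```

Under the normal assumption the centile and the Z score carry the same information; with a skewed reference distribution they would not, which is one reason the two are compared separately in the sensitivity analyses.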
Table 1
Demographic and clinical information of the included healthy controls and patients with neurological diseases.
| Overall (n = 15305) | HC (n = 12060) | MCI (n = 212) | AD (n = 467) | PD (n = 1263) | CSVD (n = 498) | MS (n = 497) | NMOSD (n = 308) | p |
Age (mean (SD)), years | 41.15 (19.19) | 37.10 (18.03) | 63.30 (9.80) | 66.49 (11.60) | 61.21 (10.53) | 59.32 (11.82) | 36.09 (12.47) | 42.46 (14.21) | < 0.001 |
Sex (n, %) | | | | | | | | | < 0.001 |
Female | 7767 (50.7) | 5987 (49.6) | 127 (59.9) | 279 (59.7) | 569 (45.1) | 197 (39.6) | 325 (65.4) | 283 (91.9) | |
Male | 7538 (49.3) | 6073 (50.4) | 85 (40.1) | 188 (40.3) | 694 (54.9) | 301 (60.4) | 172 (34.6) | 25 (8.1) | |
Education years (mean (SD)) | 12.35 (4.58) | 14.03 (4.33) | 10.83 (4.26) | 10.19 (4.13) | 10.81 (4.49) | 10.48 (3.81) | 13.40 (3.39) | 11.45 (3.68) | < 0.001 |
MMSE (mean (SD)) | 25.50 (5.38) | 27.89 (2.73) | 25.75 (3.11) | 17.02 (7.62) | 25.63 (4.15) | 23.35 (4.77) | 26.35 (6.18) | 27.77 (2.48) | < 0.001 |
MoCA (mean (SD)) | 22.21 (5.74) | 24.94 (3.14) | 20.75 (3.94) | 12.53 (7.03) | 21.23 (5.03) | 20.03 (5.28) | 26.61 (3.16) | 24.35 (4.45) | < 0.001 |
BVMT (mean (SD)) | 36.76 (14.37) | 43.47 (11.94) | | | | | 37.99 (14.17) | 33.75 (14.44) | < 0.001 |
CVLT (mean (SD)) | 82.65 (28.94) | 103.73 (22.33) | | | | | 80.50 (29.18) | 78.21 (27.97) | < 0.001 |
SDMT (mean (SD)) | 49.57 (16.64) | 52.71 (17.36) | | | | | 49.14 (13.72) | 41.92 (14.90) | < 0.001 |
PASAT (mean (SD)) | 42.74 (12.89) | 46.17 (13.72) | | | | | 43.08 (10.50) | 37.79 (12.33) | < 0.001 |
EDSS (median [IQR]) | 3.00 [1.50, 4.00] | | | | | | 2.00 [1.00, 3.50] | 3.50 [2.00, 5.00] | < 0.001 |
SVD score (median [IQR]) | | | | | | 3 [1, 4] | | | |
UPDRS-III (medication-off, mean (SD)) | | | | | 49.41 (17.32) | | | | |
UPDRS-III (medication-on, mean (SD)) | | | | | 23.74 (13.49) | | | | |
UPDRS-III after DBS therapy (DBS-on and medication-off, mean (SD)) | | | | | 27.9 (15.64) | | | | |
Relapse number (median [IQR]) | 2.00 [2.00, 4.00] | | | | | | 2.00 [1.00, 3.00] | 2.00 [2.00, 4.00] | 0.36 |
Follow-up time (mean (SD)) years | 3.97 (1.89) | | | | | | 4.27 (2.07) | 3.50 (1.46) | < 0.001 |
Follow-up EDSS (median [IQR]) | 2.50 [1.50, 4.00] | | | | | | 2.00 [1.00, 4.00] | 3.00 [1.50, 4.50] | 0.01 |
EDSS progression, n | 306 | | | | | | 186 | 120 | |
Progression (n, %) | 93 (30.4) | | | | | | 56 (30.1) | 37 (30.8) | 0.96 |
SPMS conversion, n | 186 | | | | | | 186 | | |
Conversion (n, %) | 28 (15.0) | | | | | | 28 (15.0) | | |
Note: n, number; SD, standard deviation; IQR, interquartile range; HC, healthy control; MCI, mild cognitive impairment; AD, Alzheimer's disease; PD, Parkinson's disease; CSVD, cerebral small vessel disease; MS, multiple sclerosis; NMOSD, aquaporin-4 antibody-seropositive neuromyelitis optica spectrum disorder; MMSE, Mini-Mental State Examination; MoCA, Montreal Cognitive Assessment; BVMT, Brief Visuospatial Memory Test; CVLT, California Verbal Learning Test; PASAT, Paced Auditory Serial Addition Test; SDMT, Symbol Digit Modalities Test; UPDRS-III, Unified Parkinson's Disease Rating Scale Part III; DBS, deep brain stimulation; EDSS, Expanded Disability Status Scale; SPMS, secondary progressive multiple sclerosis. Categorical data are presented as counts and percentages and were compared with Pearson’s chi-square test. Continuous data are presented as means and standard deviations and were compared with one-way analysis of variance (ANOVA). Ranked data are presented as medians and interquartile ranges and were compared with the Kruskal‒Wallis test.
For the global brain structural measures (Fig. 2), fitting the data with our normative references yielded adjusted R-squared values ranging from 0.34 to 0.59, all of which were higher than those (0.11–0.52) obtained with the calibrated international references. The Chinese normative references (ages 4.5 to 99 years) showed increases in cortical GM volume (GMV), subcortical GMV (sGMV), WM volume (WMV), mean cortical thickness and total surface area from childhood onward, peaking at 8.2 [95% confidence interval (CI) 7.9–8.4], 17.0 [95% CI 16–17.8], 38.6 [95% CI 38.1–39], 7.0 [95% CI 6.5–7.3] and 12.5 [95% CI 10.2–13.3] years, respectively, followed by a near-linear decrease. These peaks occurred 2.3, 2.6, 9.9, 4.5 and 0.7 years later, respectively, than those of the international references. The normative curves revealed increases in the CSF volume from childhood onward and an exponential trend in the sixth decade of life, which is consistent with the international references. Additionally, we report the trajectories of the volumes of two structures (the cerebellum and brainstem) that were not reported in the international references. The normative curves revealed increases in the volumes of the cerebellum and brainstem from childhood onward, peaking at 14.4 [95% CI 13.4–15.4] and 30.8 [95% CI 29.5–31.9] years, respectively. Additional findings on regional structural measures are provided in eFigure 3.
Patients with neurological diseases showed significant deviations from the normative references in all global measures, except for GMV, CSF, and total surface area in MCI patients; WMV in PD patients; and total surface area in CSVD patients (Fig. 3). With respect to regional deviations, MCI patients presented significant deviations in subcortical volume and cortical thickness and less involvement of cortical volume or surface area. AD patients showed significant widespread deviations in all the global and regional structural measures. PD patients showed significant deviations in subcortical volume, cortical volume and cortical thickness and less involvement of surface area. CSVD patients showed significant deviations in subcortical volume and cortical thickness and less involvement of cortical volume or surface area. MS patients presented significant widespread deviations in subcortical volume, cortical volume and cortical thickness and less involvement of surface area. NMOSD patients presented significant deviations in subcortical volume, cortical volume and surface area and less involvement of cortical thickness.
Clinical Tasks Using Individualized Deviation Scores in Patients with Neurological Diseases
Task 1: Estimation of Neurological Disease Propensity Scores by Deviation Scores
To illustrate how closely an individual's deviation scores resemble those of a specific disease, we estimated a disease propensity score (DPS, ranging from 0 to 1), which reflects the similarity between that individual's deviation scores and those of patients with a given neurological disease, relative to HCs. For DPS calculation, we first trained disease-specific classification models using a support vector machine (SVM) with a radial basis function kernel, following feature reduction by LASSO regression. These models were designed to distinguish each disease group from age-, sex-, and site-matched HCs. Model performance, measured by the area under the curve (AUC), was assessed using 10-fold cross-validation. We subsequently used these disease-specific models to predict individual DPSs for both HCs and patients with neurological diseases. Predictions for all cases were likewise made under 10-fold cross-validation for predictive robustness. A higher DPS from a disease-specific model indicates a greater likelihood that the individual has that particular disease. The DPS may therefore support disease screening in HCs and the differential diagnosis of neurological diseases.
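The cross-validated scoring scheme described above can be sketched as follows: each case is scored by a model trained on the other folds. This is only a schematic under stated assumptions. A nearest-centroid scorer is swapped in as a stand-in for the LASSO plus radial-basis SVM pipeline, the toy deviation scores are invented, and all function names are hypothetical.

```python
import math
import random

def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_centroids(X, y):
    """Stand-in model (NOT an SVM): mean feature vector of each class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict_score(centroids, x):
    """Map relative distance to the two centroids onto (0, 1).
    Values above 0.5 mean the case lies closer to the patient centroid."""
    d_control = math.dist(x, centroids[0])
    d_patient = math.dist(x, centroids[1])
    return 1.0 / (1.0 + math.exp(d_patient - d_control))

def out_of_fold_scores(X, y, k=10, seed=0):
    """Score every case with a model fitted without that case's fold."""
    scores = [None] * len(X)
    for fold in kfold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            scores[i] = predict_score(model, X[i])
    return scores

# Toy centile-type deviation scores (two features): 10 controls, 10 patients.
X = [[0.50 + 0.01 * i, 0.55 - 0.01 * i] for i in range(10)]
X += [[0.10 + 0.01 * i, 0.15 - 0.005 * i] for i in range(10)]
y = [0] * 10 + [1] * 10   # 0 = HC, 1 = patient

dps = out_of_fold_scores(X, y, k=5)
print([round(s, 2) for s in dps])
```

The point of the sketch is the out-of-fold structure, not the model: the stand-in scores are uncalibrated and would not reproduce the reported values.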
Using multivariate global and regional deviation scores, disease-specific classification models achieved AUCs of 0.82 [95% CI 0.79–0.86] for MCI, 0.92 [95% CI 0.91–0.94] for AD, 0.89 [95% CI 0.87–0.90] for PD, 0.81 [95% CI 0.78–0.84] for CSVD, 0.87 [95% CI 0.85–0.89] for MS, and 0.92 [95% CI 0.90–0.94] for NMOSD (Fig. 4a).
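For readers less familiar with the AUC, the sketch below computes it from its rank interpretation: the probability that a randomly chosen patient receives a higher score than a randomly chosen HC, with ties counted as one half. The scores and labels are invented for illustration and the function name is hypothetical; the confidence intervals reported above are not reproduced here.

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of (patient, control) pairs ranked correctly.
    labels: 1 = patient, 0 = healthy control. Ties count as 0.5."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical scores for three patients and three controls.
toy_scores = [0.90, 0.80, 0.35, 0.60, 0.20, 0.10]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc_from_scores(toy_scores, toy_labels)
print(round(toy_auc, 3))
```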
We used the disease-specific classification models to estimate the DPSs for all individuals. Notably, the median estimated DPS values for each disease were as follows: 0.84 [interquartile range (IQR) 0.76–0.86] for MCI, 0.94 [IQR 0.90–0.97] for AD, 0.92 [IQR 0.87–0.94] for PD, 0.84 [IQR 0.63–0.89] for CSVD, 0.90 [IQR 0.81–0.93] for MS, and 0.95 [IQR 0.93–0.96] for NMOSD (Fig. 4a).
Task 2: Prediction of Cognitive and Physical Scores in Patients with Neurological Diseases by Deviation Scores in a Cross-sectional Design
To evaluate the ability of these deviation scores to predict disease-associated clinical variables, we employed support vector regression (SVR) with 10-fold cross-validation to assess how well the deviation scores predict baseline cognitive scores, Unified Parkinson’s Disease Rating Scale Part III (UPDRS-III) scores and Expanded Disability Status Scale (EDSS) scores across patients with various neurological diseases in cross-sectional datasets. The predictive performance was assessed by Pearson’s correlation (ranging from −1 to 1) and the mean absolute percentage error (MAPE, ranging from 0 to +∞) between the predicted and actual variables. A higher Pearson’s correlation (r value) and a lower MAPE indicate better predictive performance of the deviation scores. Furthermore, to ensure the robustness of our findings, we corrected for multiple comparisons across the diseases and clinical variables using the false discovery rate (FDR) to minimize the risk of false positives.
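The two performance metrics can be written out as a short sketch. MAPE is expressed here as a fraction rather than a percentage, which is an assumption made to match the scale of the values reported in this section; the example scores are invented, and the function names are hypothetical.

```python
import math

def pearson_r(predicted, actual):
    """Pearson correlation between predicted and actual values."""
    n = len(actual)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)

def mape(predicted, actual):
    """Mean absolute percentage error as a fraction.
    Undefined when any actual value is 0 (for example, a score of 0)."""
    return sum(abs((a - p) / a) for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical actual and predicted cognitive scores.
actual_scores = [28, 25, 20, 30]
predicted_scores = [27, 26, 22, 29]
r_value = pearson_r(predicted_scores, actual_scores)
error = mape(predicted_scores, actual_scores)
print(f"r = {r_value:.2f}, MAPE = {error:.3f}")
```

Because MAPE divides by the actual value, it can exceed 1 when actual scores are small relative to the prediction error, which is consistent with its stated range of 0 to +∞.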
The correlations between the predicted and actual variables were between 0.25 and 0.70 for the Mini-Mental State Examination (MMSE) and Montreal Cognitive Assessment (MoCA) scores across HCs and all disease groups; between 0.13 and 0.67 for the Brief Visuospatial Memory Test (BVMT), California Verbal Learning Test (CVLT), Symbol Digit Modalities Test (SDMT) and Paced Auditory Serial Addition Test (PASAT) scores across HCs, MS patients and NMOSD patients; 0.24 for the UPDRS-III score in PD patients not receiving any interventions; and 0.27 and 0.42 for the EDSS score in MS patients and NMOSD patients, respectively. These correlations survived FDR correction (pFDR < 0.05), except for the SDMT in patients with NMOSD (pFDR = 0.17). The corresponding MAPEs ranged from 0.07 to 1.54. Details are provided in Fig. 4b.
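A sketch of how FDR-adjusted p values of this kind can be obtained is given below. The text specifies FDR correction but not the procedure; the Benjamini-Hochberg step-up method is assumed here, and the raw p values are invented.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p values from five disease-by-variable tests.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.20]
p_fdr = benjamini_hochberg(raw_p)
print([round(p, 5) for p in p_fdr])
```

In this toy example the third and fourth tests have raw p < 0.05 but adjusted values just above 0.05, which mirrors how a nominally significant correlation can fail to survive correction.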
Task 3: Assessment of the Treatment Response and Disease Progression of Patients with Neurological Diseases by Deviation Scores in a Longitudinal Design
To evaluate the predictive potential of deviation scores for follow-up clinical outcomes, we conducted SVM and Cox proportional hazards regression analyses for treatment response in PD patients and for stratifying MS patients and NMOSD patients with different degrees of disability progression.
(1) Predicting the Treatment Response of PD Patients Receiving Medication and DBS
For PD patients, we employed an SVM with 10-fold cross-validation to predict the UPDRS-III score (n = 461) both during medication treatment and after one year of DBS therapy. The motor improvement rate represented by the UPDRS-III score, with patients who did not receive any intervention as a reference, was also assessed. The predictive model assessment procedure was the same as that in Task 2.
The correlations between the predicted and actual variables were 0.31 [95% CI 0.24–0.37] (pFDR < 0.001) and 0.22 [95% CI 0.16–0.29] (pFDR < 0.001) for the UPDRS-III score and its improvement rate when receiving medication and 0.28 [95% CI 0.19–0.36] (pFDR < 0.001) and 0.16 [95% CI 0.07–0.24] (pFDR < 0.001) for the UPDRS-III score and its improvement rate after one year of DBS therapy (Fig. 4c). The corresponding MAPEs ranged from 0.46 to 1.17.
(2) Stratifying MS and NMOSD Patients with Different Degrees of Disability Worsening at Follow-up
For MS and NMOSD patients, disability worsening was defined according to the difference between the baseline and follow-up EDSS scores31 (MS, n = 106, mean ± SD follow-up time = 51.24 ± 24.84 months; NMOSD, n = 82, mean ± SD follow-up time = 41.0 ± 17.52 months). To stratify individuals with different degrees of disability worsening, we first conducted multivariate Cox proportional hazards regression (stepwise bidirectional elimination) on the deviation scores with 10-fold cross-validation; conversion from relapsing-remitting MS to secondary progressive MS was also analyzed. Using the predicted relative hazard, we stratified the patients into low- and high-risk subgroups, with the median predicted relative hazard of the corresponding disease as the cutoff. Kaplan‒Meier analysis with the log-rank test was conducted to compare the survival curves of the stratified subgroups (Fig. 4d). These analyses of deviation scores in MS and NMOSD patients can help identify patients at high risk of disease worsening and could aid the clinically stratified management of new patients. Additional univariate and multivariate Cox proportional hazards regression results for disability worsening in MS patients and NMOSD patients are provided in eTable 1.
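The stratification step can be sketched as a median split of the predicted relative hazards followed by a Kaplan-Meier estimate within a subgroup. The hazards, follow-up times and event indicators below are invented, assigning ties at the median to the low-risk group is an assumption, and the Cox model fitting and the log-rank test are not shown.

```python
from statistics import median

def median_split(hazards):
    """Label each patient 'high' or 'low' risk relative to the median hazard.
    Values equal to the median are assigned to 'low' (an assumption)."""
    cutoff = median(hazards)
    return ["high" if h > cutoff else "low" for h in hazards]

def kaplan_meier(times, events):
    """Kaplan-Meier curve as [(time, survival)] at each distinct event time.
    events: 1 = disability worsening observed, 0 = censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve

# Hypothetical predicted relative hazards for four patients.
groups = median_split([0.5, 1.2, 0.8, 2.0])

# Hypothetical follow-up (months) and outcomes for one subgroup.
km_curve = kaplan_meier([12, 24, 24, 36, 48, 60], [1, 0, 1, 1, 0, 0])
print(groups)
print([(t, round(s, 3)) for t, s in km_curve])
```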
A Kaplan‒Meier analysis of the subgroups stratified by the predicted risk of disability worsening revealed that MS patients with a higher predicted relative hazard had a greater risk of conversion to secondary progressive MS (log-rank p = 0.0090) but had no difference in EDSS progression compared with those with a low predicted relative hazard (log-rank p = 0.99). NMOSD patients with a higher predicted relative hazard had a greater risk of disability progression than did those with a low predicted relative hazard (log-rank p = 0.0053).
Additional Stability and Sensitivity Analyses of Deviation Scores
(1) Assessment of the Stability of Deviation Scores Using Longitudinal MRI Scans
To assess the stability of the global and regional deviation scores derived from the normative references and their potential role in monitoring disease development, we conducted additional longitudinal assessments of the deviation scores using independent longitudinal scans of HCs and patients with neurological diseases (follow-up times ranging from 0.5 to 3 years; HCs, n = 73; MCI, n = 42; AD, n = 56; MS, n = 42; and NMOSD, n = 32). The IQRs of the deviation scores of the baseline and follow-up scans for each individual were used to assess the stability of the deviation scores, similar to a previous study12. A large IQR indicated a large difference between the longitudinal scans of that individual. For each deviation score in each group, a linear mixed model with individual as a random effect was fitted to monitor the disease progression reflected by potential longitudinal changes in the deviation scores.
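The stability measure can be illustrated with a brief sketch that computes, for each individual, the IQR of one centile score across that individual's scans and then summarizes across individuals. The quantile interpolation rule (`method="inclusive"`), the subject identifiers and the centile values are assumptions for illustration; the linear mixed model is not shown.

```python
from statistics import median, quantiles

def within_subject_iqr(centiles):
    """IQR of one deviation score across an individual's scans (needs >= 2 scans)."""
    q1, _, q3 = quantiles(centiles, n=4, method="inclusive")
    return q3 - q1

# Hypothetical centile scores of one measure at baseline and follow-up.
scans = {
    "sub-01": [0.30, 0.36],
    "sub-02": [0.80, 0.70],
    "sub-03": [0.10, 0.12],
}

iqr_by_subject = {sid: within_subject_iqr(vals) for sid, vals in scans.items()}
median_iqr = median(iqr_by_subject.values())
print({sid: round(v, 3) for sid, v in iqr_by_subject.items()}, round(median_iqr, 3))
```

With exactly two scans and this interpolation rule, the IQR reduces to half the absolute difference between the two centile scores.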
IQRs of the longitudinal deviations across the global and local structural measures were calculated in both the HC and disease groups for patients for whom longitudinal scans were available (eResults 2 and eFigure 12). The median IQR across ages was 0.09 (IQR = [0.05, 0.16]), indicating a stable quantification of the deviation scores of these patients. Larger IQRs appeared to be more prevalent in younger and older populations but remained stable in the middle-aged population. This finding indicated that the variability in brain structural changes may be specific to different age groups. Furthermore, we analyzed the changes in the deviation scores at follow-up compared with the baseline scores. Compared with those of the baseline scans, the deviation scores of several structural measures (e.g., the left lateral orbitofrontal area in AD patients, the right lingual thickness in MS patients and the left lingual volume in NMOSD patients) were lower for the follow-up scans (p < 0.05), but the differences did not survive FDR correction, with all pFDRs > 0.05.
(2) Sensitivity Analysis of Deviation Scores in Different Models, Male and Female References, and Utilization of Other Comparative Quantitative Measures
To test whether there were differences in the normative curves using different models, disease-related findings using female and male references, and different quantitative measures, we performed sensitivity analyses to compare normative curves based on our reference and international references, calculate deviation scores using separate male and female references, and utilize other comparative quantitative measures (Z scores derived from the normative references and raw brain structural measures).
For the normative reference comparisons (eResults 3 and eFigure 13), we assessed the consistency of our normative references (median value curves) with international references, as well as the site-calibrated international references, using Pearson’s correlation and mean absolute scaled error (MASE). Our normative references, international references and site-calibrated references showed high consistency, with a minimum correlation greater than 0.9 and the lowest correlation for the WMV curves. The normative references of cortical thickness showed the largest MASE, indicating a large difference between our normative reference and the international reference.
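As a rough sketch of the MASE used to compare two median curves sampled on a common age grid, the example below divides the mean absolute difference between the curves by the mean absolute step-to-step change of the reference curve. The exact scaling used in the study is not given in this section, so this denominator is an assumption, and the curve values are invented.

```python
def mase(curve, reference):
    """Mean absolute scaled error of `curve` against `reference`.
    Scale = mean absolute difference between consecutive reference points
    (an assumed convention; other scalings exist)."""
    n = len(reference)
    mae = sum(abs(c - r) for c, r in zip(curve, reference)) / n
    scale = sum(abs(reference[i] - reference[i - 1]) for i in range(1, n)) / (n - 1)
    return mae / scale

# Hypothetical median cortical thickness (mm) on a shared age grid.
international_curve = [2.0, 2.2, 2.4, 2.6]
chinese_curve = [2.1, 2.3, 2.5, 2.7]
curve_mase = mase(chinese_curve, international_curve)
print(round(curve_mase, 3))
```

Two curves can be almost perfectly correlated and still have a large MASE when one is shifted relative to the other, so the two metrics capture different aspects of agreement.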
The normative references stratified by sex are presented in eResults 4 and eFigures 14–16. Compared with healthy females, healthy males presented greater median values of brain volume, cortical thickness and total surface area across the lifespan, with significant differences in supratentorial brain tissue volume and total surface area. Comparisons between patients and HCs stratified by sex revealed similar global and regional deviation scores across diseases. However, the global and regional deviations of the male subgroups of MCI, MS and NMOSD patients were less statistically significant.
In the comparison of different measures (eResults 5 and eFigures 17–24), the centile scores, Z scores and raw brain structural measures had comparable abilities to detect measures that differed significantly between the disease groups and HCs. More than 60% of the detected MRI measures overlapped among these quantitative measures, implying that a combination of these methods may provide robust brain structural markers in patients with neurological diseases. For DPS estimation, the disease-specific classification models were more specific to the corresponding disease group when using centile scores than when using raw brain structural measures (eFigure 22). The centile scores also outperformed the raw brain structural measures in stratifying MS and NMOSD patients with different degrees of disability worsening (eFigure 24).
Clinically Applicable Brain Health Report for Individuals by Deviation Scores
Finally, we developed a clinically applicable brain health report for individuals on the basis of the global and regional deviation scores (Fig. 5). In this report, we first provide global deviation scores for GMV, sGMV, WMV and cerebellum and brainstem volumes in the framework of the normative references. We then provide regional deviation score maps of subcortical and cortical volumes. Summary tables of these global and regional deviation scores are also provided, in which deviation scores below the 5th or above the 95th centile of the normative references are highlighted. Additionally, we provide the DPS as a reference for clinical diagnosis.
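The highlighting rule in the summary tables can be expressed as a short sketch. The measure names and centile values are hypothetical, and treating the thresholds as strict inequalities is an assumption.

```python
def flag_extreme(centiles, low=0.05, high=0.95):
    """Return measures whose centile falls below `low` or above `high`."""
    flags = {}
    for name, c in centiles.items():
        if c < low:
            flags[name] = "low"
        elif c > high:
            flags[name] = "high"
    return flags

# Hypothetical centile scores for one individual.
report_centiles = {"hippocampus volume": 0.02, "WMV": 0.50, "CSF volume": 0.97}
flagged = flag_extreme(report_centiles)
print(flagged)
```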