Experimental design and sample information
The AIS and ICH samples were split into a train set and a test set. In brief, the train set, consisting of AIS (n = 45) and ICH (n = 25) samples, was used in multivariate statistical analysis and Student's t-test to identify the primary candidate markers. As a critical step in the development of any new biomarker, a support vector machine algorithm was then used to verify whether the candidate markers could distinguish stroke subtypes in samples not included in the train set (i.e., the test set). For the candidate markers that passed verification, we evaluated their changing trends across the healthy, AIS and ICH groups. Subsequently, dysregulation of the core pathways was identified by pathway enrichment analysis. All cases of intracranial ischemia and hemorrhage involved in this study were confirmed by neuroimaging, and the intracranial lesion areas of AIS and ICH plaques could be clearly observed (Fig. 1). Demographic data and clinical biochemical indicators are listed in Table 1. Compared with the healthy group, the AIS and ICH groups had higher TG, LDL and BG but lower HDL. No significant difference was observed between AIS and ICH in the other clinical biochemical indexes. Compared with AIS, ICH had higher NIHSS scores.
Table 1
Demographic data of the enrolled population of the study by group
Characteristic | Health (n = 66) | Train set AIS (n = 45) | Train set ICH (n = 25) | Test set AIS (n = 18) | Test set ICH (n = 10) |
Gender(M/F) | 66(35/31) | 45(26/19) | 25(18/7) | 18(9/9) | 10(5/5) |
Age (years) | 62.29 ± 11.285 | 65.64 ± 10.474 | 63.20 ± 13.856 | 65.00 ± 9.787 | 70.70 ± 9.623 |
BMI | 22.06 ± 2.488 | 21.89 ± 1.696 | 22.42 ± 2.570 | 22.18 ± 1.454 | 20.57 ± 1.780 |
TC(mmol/L) | 5.20 ± 0.578 | 4.89 ± 1.143 | 5.05 ± 0.974 | 5.18 ± 0.959 | 5.13 ± 0.926 |
TG(mmol/L) | 1.10 ± 0.431 | 1.51 ± 0.725** | 1.32 ± 0.653 | 1.26 ± 0.380** | 1.72 ± 1.246** |
HDL(mmol/L) | 1.74 ± 0.375 | 1.29 ± 0.579** | 1.51 ± 0.617 | 1.29 ± 0.293** | 1.30 ± 0.239** |
LDL(mmol/L) | 2.95 ± 0.522 | 3.02 ± 0.838 | 2.94 ± 0.810 | 3.32 ± 0.858** | 3.05 ± 0.836* |
BG(mmol/L) | 5.19 ± 0.356 | 5.64 ± 0.109* | 6.02 ± 1.732 | 5.89 ± 1.721** | 5.95 ± 1.357** |
UA(μmol/L) | 320.15 ± 78.596 | 334.65 ± 90.739 | 334.83 ± 150.022 | 335.68 ± 125.418 | 294.15 ± 61.719 |
NIHSS Scores | - | 3.33 ± 3.490 | 4.24 ± 4.918 | 2.5 ± 2.833 | 7.1 ± 3.390# |
Gender (male [M]/female [F]), body mass index (BMI), six clinical biochemical indicators (TC: total cholesterol; TG: triglyceride; HDL: high-density lipoprotein; LDL: low-density lipoprotein; BG: blood glucose; UA: uric acid) and National Institutes of Health Stroke Scale (NIHSS) scores are indicated. “*” represents P < 0.05 (vs. Health), “**” represents P < 0.01 (vs. Health), “#” represents P < 0.01 (vs. AIS).
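The group comparisons summarized in Table 1 can be illustrated with a pooled-variance Student's t statistic computed from summary values. The following is a minimal sketch, assuming that the ± values are standard deviations and that an equal-variance two-sample test was used; the source does not state the exact test settings, and the function name is hypothetical.

```python
import math


def student_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance (Student) two-sample t statistic and degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


# HDL (mmol/L) from Table 1: Health (n = 66) vs. train-set AIS (n = 45).
# Assumption: the "±" values are standard deviations.
t_stat, df = student_t_from_summary(1.74, 0.375, 66, 1.29, 0.579, 45)
print(f"t = {t_stat:.2f}, df = {df}")
```

The p value would then be read from the t distribution with `df` degrees of freedom, for example with a statistics package.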
Analysis of plasma by gas chromatography coupled with mass spectrometry
To efficiently separate and detect compounds in human plasma, this study adopted non-targeted metabolomics based on gas chromatography-mass spectrometry. With the specific analysis procedures used, plasma compounds were well separated (Fig. 2). Next, compounds were identified by matching their mass spectra against the NIST database, accepting matches with a similarity greater than 80%. As a result, more than 50 compounds were identified (Table 2). Meanwhile, the inter-day relative standard deviation of the quality control samples was 1.01%, indicating that the instrument and data quality were stable.
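The relative standard deviation (RSD) used above as a stability measure is the standard deviation expressed as a percentage of the mean. The sketch below assumes the sample standard deviation (n - 1 denominator); the peak-area values are hypothetical and are not data from this study.

```python
import statistics


def relative_standard_deviation(values):
    """Return the RSD in percent: 100 * sample SD / mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical quality control peak areas measured on different days.
qc_peak_areas = [98.0, 100.0, 102.0]
rsd = relative_standard_deviation(qc_peak_areas)
print(f"RSD = {rsd:.2f}%")
```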
Table 2
Plasma analytes identified using the NIST database
RT(min) | Analyte | Similarity | RT(min) | Analyte | Similarity |
2.12 | Valine | 94% | 13.97 | L-Lysine | 90% |
2.29 | Alanine | 95% | 14.78 | Ribitol | 84% |
2.53 | 1,2-Butanediol | 90% | 15.48 | Phosphoric acid | 86% |
2.58 | Ethanedioic acid | 92% | 16.05 | L-Ornithine | 94% |
2.88 | Butanoic acid | 90% | 16.21 | Tetradecanoic acid | 81% |
2.94 | L-Proline | 80% | 16.38 | Pentaric acid | 80% |
3.10 | Pentanoic acid | 84% | 17.06 | D-Fructose | 92% |
3.97 | Urea | 91% | 17.45 | L-Tyrosine | 89% |
4.57 | Leucine | 92% | 17.51 | Galactose | 86% |
4.95 | L-Isoleucine | 88% | 18.02 | Palmitelaidic acid | 89% |
5.10 | Glycine | 92% | 18.12 | D-Turanose | 83% |
5.27 | Butanedioic acid | 86% | 18.18 | D-Glucose | 97% |
5.82 | Propanoic acid | 83% | 18.87 | Octadecanoic acid | 90% |
6.33 | 2,3-Dihydroxybutanoic acid | 83% | 18.91 | Inositol | 85% |
6.43 | L-Serine | 86% | 18.99 | Uric acid | 85% |
7.00 | L-Threonine | 85% | 19.05 | Heptadecanoic acid | 89% |
9.28 | Malic acid | 86% | 19.49 | 9,12-Octadecadienoic acid | 85% |
9.50 | L-Proline | 92% | 19.54 | Oleic acid | 89% |
9.73 | Phenylalanine | 94% | 19.59 | 11-trans-Octadecenoic acid | 86% |
10.27 | Creatinine enol | 87% | 19.63 | L-Tryptophan | 93% |
10.53 | Cysteine | 95% | 20.06 | Tetratriacontane | 83% |
10.76 | 2,3,4-Trihydroxybutyric acid | 86% | 20.36 | Nonadecanoic acid | 86% |
11.69 | Acetic acid | 84% | 21.25 | Hexatriacontane | 93% |
11.82 | L-Phenylalanine | 94% | 21.45 | 1,2-Benzenedicarboxylic acid | 96% |
12.13 | Glutamine | 92% | 21.85 | Docosanoic acid | 83% |
12.38 | Dodecanoic acid | 86% | 22.10 | alpha-D-Glucopyranoside | 89% |
13.17 | Asparagine | 94% | 23.99 | Cholesterol | 94% |
Primary screening of metabolites to distinguish between AIS and ICH
To further explore plasma candidate markers that differ between AIS and ICH in the train set, an orthogonal partial least squares discriminant analysis (OPLS-DA) model was applied to distinguish the two stroke subtypes. Two scores are used to evaluate this model. R2Y represents the interpretability of the model: the closer R2Y is to 1, the more of the classification of the two groups the model explains and the greater the difference between the two groups. Q2Y, calculated by internal cross-validation, represents the predictability of the model: the closer Q2Y is to 1, the more predictive and therefore the more reliable the model is. To check whether the model was over-fitted, we used a permutation test. A model can be considered free of over-fitting if the Y-axis intercept of R2 does not exceed 0.3–0.4 and the Y-axis intercept of Q2 does not exceed 0.05 (it is usually negative). In our study, the model gave R2Y = 0.978 and Q2 (cum) = 0.973, and the scores plot showed a good separation between AIS and ICH (Fig. 3A). Twenty permutations were used to verify the model internally. As shown in Fig. 3B, the model was successfully built and showed no over-fitting.
Then the Variable Importance in Projection (VIP) value was used to screen meaningful differential metabolites. The VIP value summarizes the importance of each variable in driving the observed group separation, and VIP > 1 means that the contribution of the variable to the model is greater than average. In addition, the screened differential metabolites had to show a statistically significant difference in an independent-samples t-test (p < 0.05).
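The permutation check described above can be sketched in code. In a common form of this plot, the R2 or Q2 of each permuted model is plotted against the correlation between the permuted and the original class labels, and the fitted line is extrapolated to a correlation of 0. The code below is a minimal sketch under that assumption; the function names and the example numbers are hypothetical, and the default thresholds are the ones quoted in the text.

```python
def y_intercept(correlations, scores):
    """Ordinary least-squares intercept of scores regressed on correlations."""
    n = len(correlations)
    if n != len(scores) or n < 2:
        raise ValueError("need two or more paired values")
    mean_x = sum(correlations) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in correlations)
    if sxx == 0:
        raise ValueError("correlations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(correlations, scores))
    return mean_y - (sxy / sxx) * mean_x


def not_overfitted(r2_intercept, q2_intercept, r2_max=0.4, q2_max=0.05):
    """Apply the intercept criteria quoted in the text."""
    return r2_intercept < r2_max and q2_intercept < q2_max


# Hypothetical values; the last point stands for the unpermuted model.
corr = [0.1, 0.3, 0.5, 1.0]
r2_values = [0.27, 0.41, 0.55, 0.90]
q2_values = [-0.21, 0.01, 0.23, 0.78]

r2_int = y_intercept(corr, r2_values)
q2_int = y_intercept(corr, q2_values)
print(round(r2_int, 3), round(q2_int, 3), not_overfitted(r2_int, q2_int))
```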
Finally, compared with ICH, high levels of L-tryptophan, L-phenylalanine, cysteine, L-tyrosine, L-serine, L-threonine, creatinine enol and valine and a low level of ethanedioic acid were positively associated with AIS (Fig. 3D). These nine differential metabolites could be viewed as potential biomarkers for discriminating between AIS and ICH.
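The two-part screening rule (VIP > 1 together with p < 0.05) amounts to a simple filter. The sketch below illustrates it with placeholder metabolite names and hypothetical VIP scores and p values; none of these numbers come from the study, and the function name is an assumption.

```python
def screen_metabolites(vip, p_values, vip_min=1.0, alpha=0.05):
    """Keep metabolites with VIP > vip_min and p < alpha, sorted by VIP (descending)."""
    selected = [
        name
        for name in vip
        if vip[name] > vip_min and p_values.get(name, 1.0) < alpha
    ]
    return sorted(selected, key=lambda name: vip[name], reverse=True)


# Hypothetical inputs for illustration only.
vip = {"metabolite_A": 1.8, "metabolite_B": 1.2, "metabolite_C": 0.7, "metabolite_D": 1.5}
p_values = {"metabolite_A": 0.001, "metabolite_B": 0.20, "metabolite_C": 0.01, "metabolite_D": 0.03}

candidates = screen_metabolites(vip, p_values)
print(candidates)
```

Here `metabolite_B` fails the p value criterion and `metabolite_C` fails the VIP criterion, so only the other two are kept.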
Pathway enrichment of differential metabolites
Most of the differential metabolites screened above were amino acids, including essential amino acids (i.e., valine, phenylalanine, threonine and tryptophan) and non-essential amino acids (i.e., tyrosine, cysteine and serine). Valine is a branched-chain amino acid (BCAA); BCAAs are involved in the regulation of cell growth, neurotransmitter synthesis, carbohydrate utilization and lipid metabolism [22]. Previous studies also showed that an increase in serum BCAA level was related to dyslipidemia and was positively correlated with the risk of cardiovascular disease [23, 24]. Phenylalanine, tryptophan and tyrosine are aromatic amino acids (AAAs). AAAs are precursors for the synthesis of monoamine transmitters (including catecholamines and 5-hydroxytryptamine), and 5-hydroxytryptamine (5-HT) can antagonize acetylcholine. Metabolic disorder of AAAs may be related to neurotransmitter diseases [25]. To better understand the metabolic pathways in which these candidate markers are involved, we further performed pathway analysis using the online software MetaboAnalyst (http://www.metaboanalyst.ca/). The candidate markers distinguishing AIS from ICH were mainly enriched in phenylalanine metabolism; phenylalanine, tyrosine and tryptophan biosynthesis; glycine, serine and threonine metabolism; and aminoacyl-tRNA biosynthesis (Fig. 4A). In particular, phenylalanine, tyrosine and tryptophan biosynthesis was a highly impacted pathway, implying that these metabolite markers play important roles in the regulation of the pathway. The interaction of these amino acids in vivo was closely related to the tricarboxylic acid (TCA) cycle and glutathione redox (Fig. 4B).
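Pathway enrichment of this kind is often scored with an over-representation test. The sketch below uses a hypergeometric tail probability, which is one common choice; the source does not state which enrichment test or settings were selected in MetaboAnalyst, so the test, the function name and all counts here are assumptions for illustration.

```python
from math import comb


def hypergeom_enrichment_p(total, in_pathway, selected, hits):
    """P(X >= hits) when `selected` compounds are drawn without replacement
    from `total` compounds, of which `in_pathway` belong to the pathway."""
    if not (0 <= in_pathway <= total and 0 <= selected <= total):
        raise ValueError("counts must satisfy 0 <= in_pathway, selected <= total")
    upper = min(in_pathway, selected)
    tail = sum(
        comb(in_pathway, k) * comb(total - in_pathway, selected - k)
        for k in range(max(hits, 0), upper + 1)
    )
    return tail / comb(total, selected)


# Hypothetical counts: 10 compounds in the reference set, 3 of them in the
# pathway, 4 differential metabolites, 2 of which fall in the pathway.
p_enrich = hypergeom_enrichment_p(total=10, in_pathway=3, selected=4, hits=2)
print(round(p_enrich, 4))
```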
Validation of potential markers distinguishing AIS from ICH
To explore the diagnostic performance of these potential markers, we verified them in the test set. The test set, comprising 18 AIS and 10 ICH samples, was analyzed in the external validation phase to validate the reliability of the nine differential metabolites above. A statistically significant difference between AIS and ICH in the test set (FDR-adjusted p < 0.05) was required. All nine differential metabolites were successfully verified, and their changing trends in the test set were consistent with those in the train set (Fig. 5). Furthermore, we analyzed the changing trends of these metabolites across healthy people, AIS and ICH. Compared with the healthy group, L-tryptophan, L-phenylalanine, cysteine, L-tyrosine, L-serine, L-threonine, creatinine enol and valine were significantly increased in AIS, whereas cysteine, L-tyrosine, L-serine and L-threonine were significantly decreased in ICH (Fig. 6). After defining the potential markers, their diagnostic potential was evaluated with the support vector machine algorithm. Receiver operating characteristic curves were assessed by the area under the curve and by the sensitivity and specificity at the best cutoff points. Among the markers, cysteine, L-tryptophan and L-phenylalanine showed the best ability to distinguish AIS from ICH (Fig. 7). The identification accuracy, area under the curve (AUC), sensitivity and specificity of these three potential markers were all at least 0.8. In particular, the AUC of L-phenylalanine was as high as 0.961, and 93.86% of samples could be correctly diagnosed with 88.89% sensitivity and 100% specificity (Table 3), which indicated that L-phenylalanine could be used to distinguish AIS from ICH. Ormstad et al. indicated that the ratio of phenylalanine to tyrosine can be used as a diagnostic marker of AIS [26], but that study did not evaluate the contribution of this ratio to the differentiation of stroke subtypes.
In our study, we observed that the levels of L-phenylalanine and L-tyrosine in ICH were lower than those in AIS. As phenylalanine can be hydroxylated to tyrosine, the ratio of phenylalanine to tyrosine may be a potential marker for distinguishing stroke subtypes. We therefore evaluated the ability of the L-phenylalanine/L-tyrosine ratio to distinguish AIS from ICH; it exhibited higher sensitivity but poor specificity. Therefore, the ratio of phenylalanine to tyrosine was not suitable as a clinical marker to distinguish AIS from ICH.
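The AUC used in this evaluation can be read as the probability that a randomly chosen AIS sample receives a higher score than a randomly chosen ICH sample. The sketch below computes it by direct pairwise comparison; treating AIS as the positive class is an assumption, the function name is hypothetical, and the scores are made-up values rather than study data.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a correct ranking.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain one or more scores")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores (AIS assumed to be the positive class).
ais_scores = [0.9, 0.8, 0.7, 0.4]
ich_scores = [0.6, 0.5, 0.3]
auc = auc_from_scores(ais_scores, ich_scores)
print(round(auc, 4))
```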
Table 3
Accuracy, area under the curve, sensitivity and specificity of the candidate markers used as diagnostic markers for discriminating AIS from ICH
Analytes | Accuracy (95% CI) | AUC (95% CI) | Sensitivity | Specificity |
Cysteine | 0.8571 (0.6733–0.9597) | 0.8667 (0.7105–1.0000) | 0.8889 | 0.8000 |
L-Tryptophan | 0.8929 (0.7177–0.9773) | 0.8667 (0.6663–1.0000) | 0.8889 | 0.9000 |
L-Phenylalanine | 0.9386 (0.7650–0.9912) | 0.9111 (0.7820–1.0000) | 0.8889 | 1.0000 |
L-Phenylalanine/L-Tyrosine | 0.7500 (0.5133–0.8631) | 0.8280 (0.7800–1.0000) | 1.0000 | 0.3000 |
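The accuracy, sensitivity and specificity in Table 3 follow from a two-by-two confusion matrix. The sketch below assumes that AIS is the positive class and uses confusion counts that are not reported in the source; they were chosen only because, for a test set of 18 AIS and 10 ICH samples, they reproduce the cysteine row. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,
        "specificity": tn / negatives,
    }


# Assumed counts (not from the source): 16 of 18 AIS and 8 of 10 ICH
# samples classified correctly, with AIS as the positive class.
metrics = diagnostic_metrics(tp=16, fn=2, tn=8, fp=2)
print({name: round(value, 4) for name, value in metrics.items()})
```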
Qualitative identification of potential markers and their biological activity
Using the support vector machine algorithm, we found that cysteine, L-tryptophan and L-phenylalanine had a good ability to distinguish the stroke subtypes. To further verify the qualitative identification of the three compounds, chemical standards were used as analytical controls. As shown in Fig. 8, the retention times and mass spectrum fragments of the three components in plasma samples were consistent with those of the standards.