Chemicals
HPLC-grade acetonitrile (ACN), methanol, and formic acid were purchased from Sigma-Aldrich (St. Louis, MO, USA). Deionized water was purified with a Milli-Q® purification system from Merck (Millipore, Bedford, MA, USA). All other chemicals, reagents, and solvents were of analytical grade.
Rat experimental protocol
All animal experiments were performed in accordance with the applicable Chinese legislation and approved by the Ethics Committee of Shanxi Medical University, PR China. Sprague-Dawley (SD) rats, weighing 180~220 g and 10~12 weeks old, were supplied by the Animal Center of Shanxi Medical University. The rats were housed in cages with rat chow and water under a 12-h light-dark cycle at room temperature (22 to 24 °C) and were fasted overnight before the experiment.
All animals were randomly divided into 3 groups (n = 10): control group, sham group, and AMI group. The rat model of AMI was established by conventional coronary ligation[19]. Briefly, rats were anesthetized by intraperitoneal administration of 3% pentobarbital sodium (30 mg/kg), and the lead II electrocardiogram (ECG) was monitored using a BL-420 biological functional experimental system (Chengdu Technology & Market Co., Ltd, China). The rats were then endotracheally intubated and ventilated with a small animal ventilator (HX-100E, Chengdu Technology & Market Co., Ltd, China). A left thoracotomy was performed, and the heart was exposed. The left coronary artery was ligated approximately 5 mm from the lower margin of the left auricle. Blanching of the myocardium at the left ventricular apex and ST-segment elevation on the ECG indicated that the coronary artery had been successfully occluded and MI induced. After 1 h of myocardial ischemia, the rats were euthanized by an overdose of pentobarbital sodium. Sham-operated rats underwent the same procedure without ligation of the left coronary artery. The control group received no treatment.
Blood samples were withdrawn from the rat's abdominal aorta and centrifuged at 12000 rpm for 15 min at 4°C. The supernatant serum samples were aliquoted and immediately stored at -80°C until analysis.
Human blood sample collection by forensic autopsy
This study was approved by the Ethics Committee of Shanxi Medical University, PR China, and all samples were analyzed anonymously. A total of 17 blood samples were collected from forensic autopsy cases, of which 9 were confirmed at autopsy as cardiac deaths and 8 as noncardiac sudden deaths. The causes of death were determined by professional forensic pathologists through systematic forensic autopsies (including macromorphological, histological, toxicological, and biochemical examinations) in combination with the circumstances of death and the medical history of the deceased. The forensic autopsies were performed by the Department of Forensic Pathology, Shanxi Medical University. Written informed consent was obtained from family members of the deceased individuals.
Sample preparation
Serum samples were thawed before extraction. Then, 800 µL of cold acetonitrile was added to 200 µL of serum to precipitate proteins. After vortex mixing for 1 min and centrifugation (12,000 rpm, 20 min, 4 °C), 600 µL of the supernatant was withdrawn and freeze-dried in a freeze concentration centrifugal dryer (NingBo XinZhi. Ltd, China). Finally, the residue was dissolved in 200 µL of acetonitrile/water (4:1) solution and filtered through a 0.22 µm membrane for UPLC-HRMS analysis. A quality control (QC) sample was prepared by pooling and mixing equal-volume sub-aliquots of all samples to monitor the stability of the analytical method and system.
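As a rough check on the volumes above, the following sketch estimates the net concentration factor of the preparation. It assumes that volumes are additive, that no analyte is lost during precipitation, drying, or filtration, and that the dried residue redissolves completely; the function name is illustrative and not part of the original workflow.

```python
def serum_equivalent_factor(serum_ul, solvent_ul, supernatant_ul, reconstitution_ul):
    """Return the final analyte concentration relative to the original serum.

    Assumes additive volumes and no analyte loss (idealized).
    """
    total_ul = serum_ul + solvent_ul
    # Volume of original serum represented by the withdrawn supernatant.
    serum_equivalent_ul = supernatant_ul * serum_ul / total_ul
    # Concentration after redissolving the dried residue.
    return serum_equivalent_ul / reconstitution_ul


# Volumes from the protocol: 200 uL serum + 800 uL ACN, 600 uL withdrawn,
# residue redissolved in 200 uL.
factor = serum_equivalent_factor(200, 800, 600, 200)
print(f"Relative concentration: {factor:.2f}x")
```

Under these idealized assumptions, the 600 µL of supernatant corresponds to 120 µL of serum, so each metabolite is injected at roughly 0.6 times its serum concentration.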
UPLC-HRMS analysis of serum samples
The UPLC-HRMS analysis was performed with a Thermo Scientific Ultimate™ 3000 UHPLC system coupled to a Thermo Scientific Q Exactive™ Orbitrap high-resolution mass spectrometer (Thermo Scientific, San Jose, CA, USA), which can acquire MS2 information within a single sample run. Chromatographic separation was performed on an Acquity HSS T3 column (1.8 µm, 2.1 mm × 100 mm, Waters). The column was kept at 40 °C, and the injection volume was 5 µL. The mobile phase consisted of 0.1% formic acid in water (v/v; A) and 0.1% formic acid in acetonitrile (v/v; B). The flow rate was 0.3 mL/min, with the following elution gradient: 0~5 min, 2% B; 5~13 min, 30% B; 13~15 min, 85% B; 15~17 min, 98% B; 17~17.5 min, 2% B; and re-equilibration until 20.5 min.
The key mass spectrometry parameters were as follows: the capillary temperature was 350 °C, and the spray voltages were 3.5 kV and 3.0 kV for positive and negative ion modes, respectively. The mass scan range was 80 to 1200 Da. The scanning mode was Full Scan/dd-MS2, and the mass resolution was set to 70 000. The resolution was 35 000 FWHM for MS full scan and 17 500 FWHM for MS/MS, with normalized collision energies (NCE) of 12.5, 25, and 37.5 eV.
Data preprocessing
The acquired raw data files (.raw) were imported into Compound Discoverer 3.0 (Thermo Fisher, CA, USA) for initial data processing, including peak integration, nonlinear retention time alignment, filtering, matching, etc. Simultaneously, the compounds in the serum were annotated. Annotation combined open online databases (mzCloud, HMDB, etc.), local databases, and MS/MS data, which greatly improves the accuracy of metabolite identification. The final output data included compound name, retention time, exact mass-to-charge ratio, peak area, etc. All data were imported into Excel to normalize the peak areas.
Statistical analysis of UPLC-HRMS data
All normalized metabolomic data matrices were imported into SIMCA-P 14.0 software (Umetrics, Malmö, Sweden) for multivariate data analysis. Principal component analysis (PCA) was used to observe general clustering and outliers. Subsequently, the data were subjected to partial least squares-discriminant analysis (PLS-DA) and orthogonal partial least squares-discriminant analysis (OPLS-DA), and the resulting models were used to identify the differential metabolites responsible for the separation between groups. The quality of the PLS-DA and OPLS-DA models was evaluated by response permutation testing with 200 permutations and by CV-ANOVA (P < 0.05).
Furthermore, the Mann-Whitney U test (P < 0.05) was used to evaluate differences in metabolite levels using SPSS 24.0 software (IBM Corp., Armonk, NY, USA). Potential metabolites were selected according to their variable importance in the projection (VIP) values in the OPLS-DA models and their P values in the Mann-Whitney U test.
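The selection rule described here can be sketched in plain Python as follows. This is an illustration only: the study ran the test in SPSS, whereas the sketch uses the two-sided normal approximation with tie correction and no continuity correction, which can differ from an exact test for small groups. The VIP cutoff of 1.0 is an assumed, commonly used value that the text does not state, and the metabolite names, VIP values, and peak areas are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values starting from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2))


def select_candidates(features, vip_cutoff=1.0, alpha=0.05):
    """Keep metabolites with VIP above the cutoff and P below alpha."""
    return [
        name
        for name, (vip, group_a, group_b) in features.items()
        if vip > vip_cutoff and mann_whitney_u(group_a, group_b)[1] < alpha
    ]


# Hypothetical data: name -> (VIP, control peak areas, AMI peak areas).
features = {
    "metabolite_A": (1.8, [1.0, 1.2, 0.9, 1.1, 1.3], [2.0, 2.4, 1.9, 2.2, 2.6]),
    "metabolite_B": (0.4, [1.0, 1.2, 0.9, 1.1, 1.3], [2.0, 2.4, 1.9, 2.2, 2.6]),
    "metabolite_C": (1.5, [1.0, 1.2, 0.9, 1.1, 1.3], [1.05, 1.15, 0.95, 1.25, 0.85]),
}
u_stat, p_value = mann_whitney_u(features["metabolite_A"][1], features["metabolite_A"][2])
selected = select_candidates(features)
print(selected)
```

In this toy data only metabolite_A passes both filters: metabolite_B fails the VIP cutoff and metabolite_C shows no significant group difference. In practice, `scipy.stats.mannwhitneyu` provides a tested implementation of the same test.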
Machine learning algorithms and feature selection
To screen for the more important biomarkers and establish the best mathematical classification model for AMI diagnosis, we adopted a representative set of five machine learning algorithms that are widely applied in metabolomics: GTB, SVM, RF, LR, and MLP. Before analysis, Z-score standardization was used to reduce sample variation:
Z = (X - μ)/σ
where X is the peak area of each metabolite, μ is the average peak area from each group, X - μ is the deviation from the mean, and σ is the standard deviation.
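A minimal sketch of this standardization for a single metabolite is given below. It assumes that μ and σ are computed over the set of samples being standardized and that σ is the population standard deviation; the text does not specify either choice, and the peak areas are made-up values.

```python
import statistics


def z_score(peak_areas):
    """Standardize peak areas of one metabolite: Z = (X - mu) / sigma."""
    mu = statistics.fmean(peak_areas)
    sigma = statistics.pstdev(peak_areas)  # population SD (assumption)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in peak_areas]
    return [(x - mu) / sigma for x in peak_areas]


# Hypothetical peak areas of one metabolite across eight samples.
areas = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = z_score(areas)
print(z)
```

After the transformation each metabolite has zero mean and unit standard deviation, so metabolites with very different absolute peak areas contribute on a comparable scale to the models.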
Python software (Intel Corporation, Santa Clara, CA, USA) was employed to develop mathematical models and tune the parameters based on the five machine learning algorithms. The essential features (metabolites) were selected and ranked based on their contribution to each model. The Borda count algorithm was applied to combine the five rankings into a final importance ranking of metabolites[20]. Ten-fold cross-validation and the average area under the curve (AUC) of the multivariate receiver operating characteristic (ROC) curve were used to identify the best-performing classification model, and the metabolites in this model were taken as the biomarker candidates. The boxplots of biomarkers were prepared using GraphPad Prism 7 (GraphPad Software, La Jolla, CA, USA).
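The rank aggregation step can be illustrated with a basic Borda count, sketched below. The scoring variant (n - 1 points for first place down to 0 for last) and the alphabetical tie-break are assumptions, since the text does not state which variant was used; the metabolite names and per-model rankings are hypothetical.

```python
def borda_count(rankings):
    """Combine several rankings (best first) into one consensus ranking.

    Each ranking of n items awards n - 1 points to its top item and
    0 points to its last item; points are summed across rankings.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - 1 - position)
    # Highest total first; ties are broken alphabetically for determinism.
    consensus = sorted(scores, key=lambda item: (-scores[item], item))
    return consensus, scores


# Hypothetical importance rankings from the five models.
rankings = [
    ["m1", "m2", "m3", "m4"],  # GTB
    ["m2", "m1", "m3", "m4"],  # SVM
    ["m1", "m3", "m2", "m4"],  # RF
    ["m1", "m2", "m4", "m3"],  # LR
    ["m2", "m1", "m4", "m3"],  # MLP
]
consensus, scores = borda_count(rankings)
print(consensus)
print(scores)
```

Because each model contributes only rank positions, the aggregation is insensitive to the different scales on which the five algorithms express feature importance.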
Predictive model construction and performance assessment
The best-performing model was used as the predictive model for AMI. We randomly split all samples into a 70% training set and a 30% testing set to assess the overall performance of the model. A 70/30 split is common practice for moderately sized sample sets in machine learning applications. Predictive power was assessed using confusion matrices and ROC curves with their associated AUC values. Additionally, the metabolomics data of the autopsy cases were used as an external validation set to evaluate the performance of the predictive model for AMI.
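The evaluation steps can be sketched in plain Python as follows. Several details are assumptions rather than the authors' procedure: the split is stratified by class, the decision threshold is 0.5, and the labels and predicted scores are hypothetical stand-ins for the output of a fitted model.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Randomly split sample indices, keeping class proportions (assumption)."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return sorted(train_idx), sorted(test_idx)


def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = AMI)."""
    pairs = list(zip(y_true, y_pred))
    return {
        "TP": sum(1 for t, p in pairs if t == 1 and p == 1),
        "FP": sum(1 for t, p in pairs if t == 0 and p == 1),
        "TN": sum(1 for t, p in pairs if t == 0 and p == 0),
        "FN": sum(1 for t, p in pairs if t == 1 and p == 0),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical cohort: 10 AMI (1) and 10 non-AMI (0) samples.
labels = [1] * 10 + [0] * 10
train_idx, test_idx = stratified_split(labels, test_fraction=0.3, seed=0)

# Hypothetical true labels and model scores for six test samples.
y_test = [1, 1, 1, 0, 0, 0]
test_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in test_scores]

cm = confusion_matrix(y_test, y_pred)
auc = roc_auc(y_test, test_scores)
print(len(train_idx), len(test_idx), cm, round(auc, 3))
```

With scikit-learn, `train_test_split`, `confusion_matrix`, and `roc_auc_score` cover the same steps. The external validation described above would reuse the same metrics on the autopsy samples without refitting the model.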