Clinical characteristics of study participants
The clinical characteristics of all participants are presented in Table 1. Compared with normal pregnant women, women who developed GDM had higher FPG and higher 1-h and 2-h blood glucose levels after an OGTT, as well as smaller changes in BMI during pregnancy, in both the second and third trimesters. HbA1c, TG, TC, and LDL differed significantly between the case and control groups in the second trimester, which is consistent with previous research. TC, gestational week at delivery, neonatal weight, and neonatal length differed significantly between GDM patients and healthy pregnant women in the third trimester. Women with GDM were more likely to have a family history of diabetes. Age, BMI, and gestational week were similar between the two groups in each trimester.
Metabolomics profiling of study participants
As shown in the metabolomics profiles (Figures 2a, b), PCA did not separate the case group (GDM) from the control group (normal) in either the second or the third trimester; however, PLS-DA showed relatively clear discrimination in both trimesters (Figures 2c, d). The OPLS-DA results (Figures 2e, f) indicated that the differences between GDM patients and normal controls could be evaluated from metabolite abundance. The R2 and Q2 of the OPLS-DA model were 0.347 and 0.165 in the second trimester and 0.324 and 0.201 in the third trimester, respectively. Permutation tests (n = 200) were used to validate the predictive ability of the OPLS-DA models (Figure S1). The R2 and Q2 values derived from the permuted data were lower than the original values, indicating that the OPLS-DA models were not overfitted. The VIP values of the OPLS-DA model, together with the criteria |log2FC| >0 and p <0.05, were then used to determine the DEMs. A volcano plot was used to display the statistically significant differences in metabolite levels between normal pregnant women and GDM patients (Figure 3). As shown in the volcano plot, of the 200 metabolites detected in this study, 57 metabolites in the second-trimester group (Table S1) and 72 metabolites in the third-trimester group (Table S2) were considered DEMs. The top five DEMs in the second-trimester group were 3-methyl-2-oxovaleric acid, 3-hydroxybutyric acid, palmitic acid, alpha-hydroxyisobutyric acid, and acetic acid. In the third-trimester group, they were ketoleucine, alpha-ketoisovaleric acid, pyruvic acid, L-tryptophan, and succinic acid. Most DEMs were down-regulated during the second trimester (Figure 3a), whereas many DEMs were up-regulated during the third trimester (Figure 3c), suggesting different metabolic dysfunction at different stages of pregnancy.
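The DEM criteria described above can be illustrated with a minimal sketch. This is not the pipeline used in the study: the VIP cut-off of 1.0 is an assumption (the text does not state a threshold), the p values are assumed to have been computed beforehand, and the metabolite names and values are hypothetical.

```python
import math
from typing import NamedTuple


class Metabolite(NamedTuple):
    name: str
    mean_case: float     # mean abundance in the GDM group
    mean_control: float  # mean abundance in the control group
    p_value: float       # two-group test p value, assumed precomputed
    vip: float           # VIP score taken from an OPLS-DA model


def log2_fold_change(m: Metabolite) -> float:
    """log2 of the case/control abundance ratio."""
    return math.log2(m.mean_case / m.mean_control)


def select_dems(metabolites, p_cutoff=0.05, vip_cutoff=1.0):
    """Return (name, log2FC, direction) for metabolites passing all criteria.

    vip_cutoff=1.0 is an assumed threshold, not one reported in the text.
    """
    dems = []
    for m in metabolites:
        lfc = log2_fold_change(m)
        if abs(lfc) > 0 and m.p_value < p_cutoff and m.vip > vip_cutoff:
            direction = "up" if lfc > 0 else "down"
            dems.append((m.name, lfc, direction))
    return dems


# Hypothetical values for illustration only; not data from the study.
example = [
    Metabolite("metabolite_A", 2.0, 4.0, 0.001, 1.8),
    Metabolite("metabolite_B", 3.0, 1.5, 0.020, 1.2),
    Metabolite("metabolite_C", 5.0, 4.9, 0.400, 0.6),
]
dems = select_dems(example)
```

In a volcano plot, the log2FC of each metabolite would be placed on the horizontal axis and -log10(p) on the vertical axis, so the returned direction corresponds to the side of the plot on which a DEM falls.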
Metabolic enrichment of biological function and pathway relevant to GDM
The DEMs for each comparison group were evaluated using enrichment analysis with SMPDB (Figures 4a, b). In the second-trimester group, the alpha linolenic acid and linoleic acid metabolism pathways had the highest fold enrichment, lowest p value (p <0.001), and FDR of <0.1. Other significant functions included beta oxidation of very long chain fatty acids and valine-leucine-isoleucine degradation (Figure 4a and Table 2, Table S3). In the third-trimester group, functions such as urea cycle, ammonia recycling, glycine and serine metabolism, valine-leucine-isoleucine degradation, arginine and proline metabolism, alanine metabolism, glutamate metabolism, aspartate metabolism, glucose-alanine cycle, phenylalanine and tyrosine metabolism, and carnitine synthesis were significantly associated with the corresponding DEMs (Figure 4b and Table 3, Table S4).
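The fold enrichment and FDR values reported for each metabolite set can be understood through the following sketch. It assumes a simple over-representation definition of fold enrichment and the Benjamini-Hochberg procedure for the FDR; the text does not state how the enrichment tool computes either quantity, and the set size, hit count, and p values below are hypothetical.

```python
def fold_enrichment(hits, n_dems, set_size, n_background):
    """Observed fraction of DEMs in a set divided by the expected fraction."""
    observed = hits / n_dems
    expected = set_size / n_background
    return observed / expected


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# 57 DEMs among 200 metabolites are taken from the text; using the 200
# measured metabolites as background, and the hit count (4) and set size
# (10), are assumptions made for illustration.
fe = fold_enrichment(hits=4, n_dems=57, set_size=10, n_background=200)

# Hypothetical raw p values for four metabolite sets.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

A set would then be reported as enriched when its adjusted value falls below the chosen FDR threshold, such as the FDR of <0.1 mentioned above.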
Pathway analysis was also performed to investigate the function of the DEMs. In the second-trimester group, 32 pathways were observed, nine of which were significantly enriched (Figure 4c, Table S5); among these, two pathways, valine-leucine-isoleucine biosynthesis and valine-leucine-isoleucine degradation, played key roles in reflecting the changes in metabolites. For the third-trimester group, 48 pathways were found, of which 21 were significantly enriched with DEMs (Figure 4d, Table S6).
Furthermore, remarkable differences were also observed between the trimesters at the function and pathway levels (Figures 4 and S2), suggesting that stage-specific biomarkers and diagnostic models should be considered.
Selection of potential metabolic biomarkers for GDM
After metabolomic differences between the groups had been observed and supported by functional enrichment analysis, the next step was to establish a diagnostic model for predicting the presence of GDM in pregnant women and to select potential metabolite features according to their importance, as determined using machine learning algorithms [16, 17].
For the second-trimester group, the samples were first divided into a training set (70% of samples) and a test set (30% of samples). An RF model was then fitted to the training data to obtain an importance score for each metabolite; candidate metabolite biomarkers were selected from those with the highest importance scores, favoring overlap with the DEMs as far as possible. Next, an LR model was constructed on the training data using these metabolite biomarkers (Table S7), giving a training AUC of 0.808. Finally, the second-trimester-specific LR model was validated on the test data and achieved a test AUC of 0.807 (Figure 5a). Similarly, for the third-trimester group, the metabolite biomarkers (Table S7) were selected using RF, and an LR model was built, which achieved an AUC of 0.819 on the training data and 0.810 on the test data (Figure 5b). In addition, almost all identified biomarker candidates were DEMs (Table 4).
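The evaluation steps of this workflow, namely the 70/30 split and the AUC calculation, can be sketched as follows. The sketch uses only the Python standard library and deliberately omits the RF and LR fitting; the function names, the random seed, and the labels and scores are hypothetical, and the scores stand in for predicted probabilities from a fitted model.

```python
import random


def train_test_split(samples, train_fraction=0.7, seed=0):
    """Shuffle samples reproducibly and split them into training and test sets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def roc_auc(labels, scores):
    """AUC as the probability that a case scores higher than a control.

    Ties between a case and a control count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical sample identifiers, labels (1 = GDM, 0 = control), and scores.
train_ids, test_ids = train_test_split(range(10))
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.8, 0.1]
auc = roc_auc(labels, scores)
```

Computing the AUC separately on the training and test sets, as reported above, gives a simple check on overfitting: similar values on both sets suggest that the model generalizes to unseen samples.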
Clinical relevance of metabolic biomarkers for GDM
Finally, to confirm the clinical relevance of the metabolic biomarkers associated with GDM, WGCNA was performed to infer associations between metabolite modules and clinical indices. As shown in Figure 6a, four modules were detected for the second-trimester group. Module turquoise was significantly associated with GDM (i.e., the group index) and many other important clinical indices, including pre-pregnancy BMI, OGTT, TC, TG, and LDL. Additionally, this module was significantly enriched in DEMs, containing 11 (P = 9.99e−04), compared with three DEMs in module blue (P = 0.666), two in module brown (P = 0.580), and 41 in module grey (P = 0.868) (Figure 6c). Similarly, five modules were found for the third-trimester group (Figure 6b): module turquoise contained 21 DEMs (P = 6.737e−08), module blue five (P = 0.944), module brown six (P = 0.090), module grey 37 (P = 0.992), and module yellow three (P = 0.606). In the third trimester, module turquoise was associated with group, OGTT, TC, TG, and LDL, and module yellow was positively associated with group, pre-pregnancy BMI, and OGTT. It should be noted that almost all biomarker candidates are DEMs (Table 4); thus, they contribute similarly to particular modules and to their associations with GDM.
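One common way to obtain a P value for the number of DEMs in a module is a one-sided hypergeometric test, sketched below. The text does not state which test was used, so this choice is an assumption; the module size of 20 is hypothetical, whereas the 200 metabolites, 57 second-trimester DEMs, and 11 DEMs in module turquoise are taken from the text.

```python
from math import comb


def hypergeom_upper_tail(k, module_size, n_dems, n_total):
    """P(X >= k) for the number of DEMs X in a module.

    The module is treated as module_size metabolites drawn without
    replacement from n_total metabolites, n_dems of which are DEMs.
    """
    denominator = comb(n_total, module_size)
    upper = min(module_size, n_dems)
    numerator = sum(
        comb(n_dems, i) * comb(n_total - n_dems, module_size - i)
        for i in range(k, upper + 1)
    )
    return numerator / denominator


# module_size=20 is a hypothetical value; the text does not report module sizes.
p_turquoise = hypergeom_upper_tail(k=11, module_size=20, n_dems=57, n_total=200)
```

Under this formulation, a small P value indicates that a module contains more DEMs than expected by chance, whereas a large module such as module grey can contain many DEMs without being enriched.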