In total, 152 participants were recruited to the study. After exclusion of 3 women who did not provide CGM data, 2 women with antibiotics intake during the study period, 34 women with inaccurate food diaries, 2 women with less than 6 meals left after filtering, and 6 microbiota samples with low read count (<10,000 reads), 105 participants (77 women with GDM and 28 healthy pregnant women) were included in the final analysis (Fig. 1).
The characteristics of the participants are summarized in Table 1. Women with GDM did not differ from the control group in age or in gestational age at initiation of continuous glucose monitoring. Women with GDM had a higher body mass index (BMI) before pregnancy. As expected, healthy pregnant women had lower plasma glucose levels during OGTT and lower hemoglobin A1C (HbA1C) at inclusion in the study.
Women with GDM consumed less carbohydrate (28.4 ± 10.9 vs 36.6 ± 10.8 g) and more protein (17.0 ± 5.2 vs 13.8 ± 2.9 g) per meal than healthy women (Table 1). Presumably for this reason, iAUC120 and GLUmax did not differ significantly between the groups and even tended to be lower in women with GDM than in their healthy counterparts, who were not dieting (0.52 ± 0.29 vs 0.63 ± 0.28 and 6.2 ± 0.6 vs 6.4 ± 0.6 mmol/L, respectively) (Table 1). For details of the lifestyle assessments and baseline blood tests, see Supplementary Table S3.
Microbial features in women with higher and lower PPGRs
As GLUmax and iAUC120 during CGM did not differ between women with and without GDM, we combined their data to select microbial features associated with higher and lower PPGRs. The cohort medians for iAUC120 and GLUmax were 0.527 and 6.254 mmol/L, respectively. Participants whose median PPGR index (iAUC120 or GLUmax) was below the corresponding cohort median were considered to have lower PPGRs, and those whose median was equal to or above the cohort median comprised the subgroup with higher PPGRs.
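The grouping rule above can be illustrated with a minimal sketch. The participant identifiers and values are hypothetical, and the sketch assumes that the cohort median is taken over participants' personal medians, which the text does not state explicitly.

```python
from statistics import median

def split_by_cohort_median(meal_values_by_participant):
    """Label each participant 'lower' or 'higher' relative to the cohort median.

    meal_values_by_participant maps a participant id to a list of per-meal
    PPGR values (iAUC120 or GLUmax).
    """
    # Step 1: each participant's own median over all of their meals.
    personal = {pid: median(vals)
                for pid, vals in meal_values_by_participant.items()}
    # Step 2: cohort median (assumption: computed over personal medians).
    cohort = median(personal.values())
    # Step 3: strictly below the cohort median -> lower; equal or above -> higher.
    labels = {pid: ("lower" if m < cohort else "higher")
              for pid, m in personal.items()}
    return cohort, labels

# Hypothetical iAUC120 values for four participants.
example = {
    "P1": [0.30, 0.40, 0.50],
    "P2": [0.55, 0.60, 0.70],
    "P3": [0.45, 0.50, 0.52],
    "P4": [0.80, 0.90, 0.65],
}
cohort_median, groups = split_by_cohort_median(example)
```

Because ties are assigned to the higher subgroup, the two subgroups need not be of equal size.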
Linear discriminant analysis effect size (LEfSe) revealed 18 bacterial taxa with significantly higher scores in the subgroup of women with higher iAUC120 and 21 bacterial taxa with higher scores in the subgroup with lower iAUC120 (P < 0.05 for all) (Fig. 2). Bacterial taxa with notably higher scores in women with higher iAUC120 included Dorea (Lachnospiraceae), Fusicatenibacter (Lachnospiraceae), Ruminococcus torques group (Oscillospiraceae), Prevotella 9 (Bacteroidia, Prevotellaceae), Coprococcus comes (Lachnospiraceae), Roseburia (Lachnospiraceae), “Lachnoclostridium edouardi” (Lachnospiraceae), Marvinbryantia (Lachnospiraceae), Anaerobutyricum hallii (basonym: Eubacterium hallii) (Lachnospiraceae), and Colidextribacter (Bacillota). Taxa with higher scores in the subgroup with lower iAUC120 included Oscillospiraceae UCG-002, Muribaculaceae (Bacteroidota, Bacteroidia), Ruminococcus champanellensis (Oscillospiraceae), Christensenellaceae R-7 group (Clostridia), Parabacteroides distasonis (Bacteroidia, Tannerellaceae), Blautia (Lachnospiraceae), Sellimonas (Lachnospiraceae), Eisenbergiella tayi (Lachnospiraceae), and Bilophila wadsworthia (Deltaproteobacteria, Desulfovibrionaceae) (Fig. 2). All bacterial taxa distinguished by LEfSe were included as input variables for the PPGR prediction models.
When comparing women with higher and lower GLUmax, 7 taxa were enriched in the subgroup with higher GLUmax, including Clostridia UCG 014 and “Lachnoclostridium” (Lachnospiraceae), and 8 taxa were enriched in the subgroup with lower GLUmax, including Methanosphaera (Methanobacteria), Lachnospira eligens (basonym: Eubacterium eligens) (Lachnospiraceae), Butyricicoccus faecihominis (Oscillospiraceae), Intestinibacter bartlettii (Clostridia, Peptostreptococcaceae), Sellimonas (Lachnospiraceae), E. tayi (Lachnospiraceae), and Christensenellaceae R-7 group (Clostridia) (Fig. 3).
Predicting individual postprandial responses
We assessed how well different combinations of input variables predict personal postprandial responses, iAUC120 and GLUmax. A total of 750 days of concurrent CGM usage and meal logging yielded 3,514 meals with PPGRs for analysis. Meal filtering (see RESEARCH DESIGN AND METHODS, Meal preprocessing) reduced the dataset to 2,706 meals. After removal of outliers in the target variable, the final dataset comprised 2,633 meals with PPGRs for the GLUmax prediction model and 2,628 meals for the iAUC120 prediction model. Prediction models for both indices were developed using gradient boosting algorithms, with the following combinations of input variables: 1) only carbohydrate content of the meal (carbs); 2) clinically available parameters (anthropometric, biochemical, lifestyle questionnaire, meal content and meal context, CGM data); 3) model 2 parameters + microbial features (the full model). For the full list of features, see Supplementary Table 1. The models were validated using a three-fold cross-validation scheme (see RESEARCH DESIGN AND METHODS).
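The validation scheme can be sketched as follows. This is an illustration only: a one-variable least-squares line stands in for the gradient boosting model, the data are made up, and the folds are split by meal, which is an assumption because the text does not say whether meals from one participant were kept in the same fold.

```python
import random

def three_fold_indices(n_items, seed=0):
    """Shuffle indices 0..n_items-1 and deal them into three folds."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    return [idx[k::3] for k in range(3)]

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for the real model)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def out_of_fold_predictions(xs, ys, seed=0):
    """Predict every meal with a model fitted on the other two folds."""
    preds = [None] * len(xs)
    for held_out in three_fold_indices(len(xs), seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = a + b * xs[i]
    return preds

# Hypothetical meals: carbohydrate grams and a made-up linear response.
carbs = [10, 20, 30, 40, 50, 60, 15, 25, 35]
response = [0.2 + 0.01 * c for c in carbs]
predicted = out_of_fold_predictions(carbs, response)
```

Each meal receives exactly one prediction from a model that did not see it during fitting, so the pooled predictions can be compared with the measured values.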
For GLUmax, the first model, which relied solely on the amount of carbohydrates in a meal, showed the lowest correlation with PPGRs (R = 0.35) and accounted for only 5% of the variation in glycemic response (Fig. 4A). The second model, based on clinically available parameters, achieved a significantly higher correlation (R = 0.62) and explained 34% of the variance (Fig. 4B). Adding microbiome features (Fig. 4C) further increased the predictive ability, with an R of 0.66 and a coefficient of determination of 42%.
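The reported explained variance is not simply the square of R (for example, 0.35 squared is about 0.12, not 0.05). One possible reason, assuming the coefficient of determination is computed as 1 − SS_res/SS_tot on held-out predictions (the text does not give the formula), is that correlation ignores bias and scale errors in the predictions whereas the coefficient of determination penalizes them. A minimal sketch with made-up numbers:

```python
from math import sqrt

def pearson_r(measured, predicted):
    """Pearson correlation between measured and predicted values."""
    n = len(measured)
    mm = sum(measured) / n
    mp = sum(predicted) / n
    cov = sum((m - mm) * (p - mp) for m, p in zip(measured, predicted))
    var_m = sum((m - mm) ** 2 for m in measured)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / sqrt(var_m * var_p)

def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mm = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mm) ** 2 for m in measured)
    return 1 - ss_res / ss_tot

# Hypothetical GLUmax values (mmol/L): predictions follow the trend exactly
# but span too narrow a range.
measured = [5.0, 6.0, 7.0, 8.0]
compressed = [6.2, 6.4, 6.6, 6.8]

r = pearson_r(measured, compressed)
r2 = r_squared(measured, compressed)
```

Here the correlation is essentially 1 while the coefficient of determination is about 0.36, because the predictions understate the spread of the measured values.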
Likewise, for iAUC120, the model based solely on the carbohydrate content of meals showed a relatively weak correlation (R = 0.51) and explained only 26% of the variation in glycemic response (Fig. 5A). Adding the clinically available parameters (model 2) increased the correlation between CGM-measured and predicted values (R = 0.71, R² = 0.50). Adding microbial features to this model slightly increased the accuracy of prediction (R = 0.72, R² = 0.52) (Fig. 5B-5C).
Because the performance of a model can also be affected by non-linear relationships between measured and predicted values, we also assessed MAE, MSE and RMSE for the better-performing models (models 2-3, Table 2). As shown in Table 2, adding microbial features decreased MAE, MSE and RMSE for GLUmax prediction but did not change these metrics for iAUC120 prediction.
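For reference, the three error metrics can be computed from paired measured and predicted values as in the sketch below. The numbers are hypothetical and are not taken from Table 2.

```python
from math import sqrt

def error_metrics(measured, predicted):
    """Return MAE, MSE and RMSE for paired measured and predicted values."""
    errors = [p - m for m, p in zip(measured, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)   # mean absolute error
    mse = sum(e * e for e in errors) / len(errors)    # mean squared error
    return {"MAE": mae, "MSE": mse, "RMSE": sqrt(mse)}

# Hypothetical GLUmax values in mmol/L.
measured = [6.0, 6.5, 7.0, 5.5]
predicted = [6.5, 6.5, 6.0, 5.5]
metrics = error_metrics(measured, predicted)
```

MAE and RMSE are in the units of the target variable, while MSE is in squared units and weights large errors more heavily.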
Exploring factors influencing the prediction of postprandial glycemic responses
Following the examination of different models predicting PPGRs, our subsequent focus was on understanding the individual factors influencing prediction accuracy, including microbial features and other parameters comprising the full model. For this purpose, we conducted feature attribution analysis employing SHAP [19].
The features that exerted the greatest influence on iAUC120 prediction, as indicated by the highest mean absolute SHAP value, encompassed the carbohydrate content of the meal, glycemic load of the meal, amount of starch in the meal, and CGM-derived parameters characterizing glucose levels preceding the meal (glucose level 10 minutes before meal and glucose rise from 240 minutes before the meal to meal start) (Fig. 6A). The most influential parameters for the prediction of GLUmax were the glucose levels at the onset of the meal (GLU0), the carbohydrate content of the meal, glycemic load of the meal, RA of I. bartlettii, and the amount of protein consumed up to 6 hours before the meal (Fig. 6B).
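Ranking by mean absolute SHAP value can be sketched as follows, assuming a matrix of per-meal SHAP values is already available (one row per meal, one column per feature). The feature names and values are hypothetical.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names, top_k=None):
    """Rank features by the mean of |SHAP value| across all rows."""
    n = len(shap_rows)
    mean_abs = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    ranked = sorted(mean_abs.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k] if top_k else ranked

# Hypothetical feature names and per-meal SHAP values.
features = ["carbs_g", "glu0", "protein_6h"]
shap_rows = [
    [0.40, -0.10, 0.05],
    [-0.20, 0.30, -0.05],
    [0.30, -0.20, 0.05],
]
ranking = rank_by_mean_abs_shap(shap_rows, features)
```

Taking the absolute value first means that a feature which pushes predictions up for some meals and down for others is still counted as influential.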
Among the 20 most influential parameters for the prediction of iAUC120 or GLUmax, the algorithm selected the RA of the following bacterial taxa: I. bartlettii, “L. edouardi”, B. faecihominis (for iAUC120), and I. bartlettii, L. eligens (basonym: Eubacterium eligens), and R. champanellensis (for GLUmax) (Fig. 6A, B). Notably, I. bartlettii ranked fourth among influential parameters for the prediction of GLUmax and was selected by the algorithm among the top parameters both for iAUC120 and for GLUmax prediction.
To assess the cumulative influence of microbial composition and other feature groups on the model, we summed the SHAP values of the associated features (Fig. 7). This analysis showed that meal composition had the greatest effect on the prediction of iAUC120, followed by CGM-derived data, meal context, and microbial composition (Fig. 8). In contrast, for the prediction of GLUmax the main predictor group was CGM-derived data, followed by meal composition and meal context, with microbial data again in fourth place (Fig. 7).
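The group-level aggregation can be sketched as below. It assumes that the quantity summed within each group is the mean absolute SHAP value of each feature, which is one plausible reading of the text. The feature names, group assignments, and values are hypothetical.

```python
def group_importance(mean_abs_shap, feature_group):
    """Sum per-feature importance within each feature group, largest first."""
    totals = {}
    for feature, value in mean_abs_shap.items():
        group = feature_group[feature]
        totals[group] = totals.get(group, 0.0) + value
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical mean absolute SHAP values per feature.
mean_abs_shap = {
    "carbs_g": 0.30, "starch_g": 0.10,
    "glu0": 0.25, "glucose_rise_240": 0.05,
    "ra_taxon_a": 0.04, "ra_taxon_b": 0.03,
}
# Hypothetical assignment of features to groups.
feature_group = {
    "carbs_g": "meal composition", "starch_g": "meal composition",
    "glu0": "CGM-derived", "glucose_rise_240": "CGM-derived",
    "ra_taxon_a": "microbiome", "ra_taxon_b": "microbiome",
}
group_ranking = group_importance(mean_abs_shap, feature_group)
```

A group with many weak features can outrank a group with one strong feature under this scheme, so group totals should be read alongside the per-feature ranking.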