Predictive Modeling of Gestational Weight Gain: A Machine Learning Multiclass Classification Study

doi:10.21203/rs.3.rs-4487465/v1


Research Article


This work is licensed under a CC BY 4.0 License


Background

Gestational weight gain (GWG) is a critical factor influencing maternal and fetal health. Excessive or insufficient GWG can lead to various complications, including gestational diabetes, hypertension, cesarean delivery, low birth weight, and preterm birth. This study aims to develop and evaluate machine learning models to predict GWG categories (below, within, or above recommended guidelines).

Methods

We analyzed data from the Araraquara Cohort, Brazil, comprising 1557 pregnant women with a gestational age of 19 weeks or less. Predictors included socioeconomic, demographic, lifestyle, morbidity, and anthropometric factors. Five machine learning algorithms (Random Forest, LightGBM, AdaBoost, CatBoost, and XGBoost) were employed for model development. The models were trained and evaluated using a multiclass classification approach. Model performance was assessed using the area under the ROC curve (AUC-ROC), F1 score, and Matthews correlation coefficient (MCC).

Results

The outcomes were categorized as follows: GWG within recommendations (28.7%), GWG below recommendations (32.5%), and GWG above recommendations (38.7%). The LightGBM model presented the best overall performance, with an AUC-ROC of 0.79 for predicting GWG within recommendations, 0.756 for GWG below recommendations, and 0.624 for GWG above recommendations. The Random Forest model also performed well, achieving an AUC-ROC of 0.774 for GWG within recommendations, 0.732 for GWG below recommendations, and 0.593 for GWG above recommendations. The most important predictors of GWG were pre-gestational BMI, maternal age, glycemic profile, hemoglobin levels, and arm circumference.

Conclusion

Machine learning models can effectively predict GWG categories, providing a valuable tool for early identification of at-risk pregnancies. This approach can enhance personalized prenatal care and interventions to promote optimal pregnancy outcomes.

Keywords: Gestational weight gain, machine learning, prediction models, maternal health, fetal health, Araraquara Cohort

Introduction

Gestational Weight Gain (GWG) is a critical factor that directly influences maternal and fetal health. Inadequate weight gain during pregnancy, whether excessive or insufficient, is associated with various complications such as gestational diabetes, hypertension, cesarean delivery, low birth weight, and preterm birth [1–3]. Recent data indicate that a significant proportion of pregnant women do not meet the recommended parameters set by the Institute of Medicine (IOM), highlighting the need for personalized and early interventions to improve pregnancy outcomes [4, 5].

Machine learning (ML), a subfield of artificial intelligence (AI), offers new opportunities for analyzing large volumes of data (big data) and identifying complex patterns that traditional statistical methods may not capture [6–8]. The application of ML techniques in public health has rapidly expanded, providing powerful tools for prediction, diagnosis, and monitoring of health conditions [9–12]. In the context of perinatal health, accurate prediction of GWG can enable the early identification of at-risk pregnant women and the implementation of targeted interventions.

Previous studies have demonstrated a direct relationship between GWG and various maternal and infant health outcomes, mediated by factors such as pre-pregnancy body mass index (BMI), maternal age, sociodemographic conditions, and race [13, 14]. Excessive weight gain is a risk factor for gestational complications, such as gestational diabetes and hypertension, and it is also associated with metabolic and cardiometabolic diseases in childhood [15, 16]. On the other hand, insufficient weight gain is related to low birth weight, preterm birth, and perinatal mortality [17–19]. Despite the promising potential of ML, the literature remains scarce in basic or translational research that uses AI to predict maternal and infant outcomes, especially in low-income regions and with limited sample sizes [11, 20–22]. This study aims to fill this gap by applying advanced ML techniques to predict categories of GWG. The objective of this study is to identify women at higher risk of inadequate weight gain during pregnancy, enabling preventive interventions that promote healthy pregnancy outcomes. Using longitudinal data from the Araraquara Cohort, we tested and compared the performance of ML algorithms in a multiclass classification approach. Our results aim to contribute to the improvement of personalized prenatal care and the reduction of disparities in maternal and infant health outcomes.

Materials and methods

Dataset Description

We analyzed data from a population-based cohort study conducted in Araraquara, São Paulo, Brazil (Araraquara Cohort). The sample included women with a gestational age less than or equal to 19 weeks, who received prenatal care at Basic Health Units in Araraquara. Participants were followed quarterly throughout prenatal care until the birth of their children from 2017 to 2022. Excluded from the study were women with twin pregnancies and those who had a pre-viable abortion. In cases of fetal death and stillbirths, only pregnancy data were considered.

Several characteristics were considered for predicting GWG, as shown in Fig. 1. Socioeconomic and demographic factors included age (≤ 19, 20–35, or > 35 years), educational level (≤ 4, 5–11, or ≥ 12 years of schooling), per capita income in Brazilian reais (1 US$ = 4.9 R$), race (white or non-white), marital status (married/stable union or single/separated/widowed), and the number of previous pregnancies (0, 1, or ≥ 2). Lifestyle factors included physical activity, smoking, and alcohol consumption. Morbidity factors included diabetes, hypertension, urinary tract infection, and cervicitis/vaginitis.

Anthropometric data of the pregnant women comprised height (cm), categorized into tertiles; BMI (kg/m²); arm circumference (cm); and body fat percentage. Other relevant data included gestational age at birth, glycemic profile (fasting glucose [mg/dL], insulin [µIU/mL], HOMA [µIU/mL], glycated hemoglobin [%]), high-sensitivity C-reactive protein (hs-CRP [ng/mL]), hemoglobin [g/dL], and lipid profile (total cholesterol, LDL-c, HDL-c, and triglycerides [mg/dL]). Additionally, the number of family members per room was categorized into tertiles.

Outcome definition

GWG was calculated as the difference between weight at delivery and pre-pregnancy weight. GWG was then classified into three categories according to the recommendations of the Institute of Medicine (IOM): (a) GWG below IOM recommendations, (b) GWG within IOM recommendations, and (c) GWG above IOM recommendations [5].
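The outcome definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the ranges are the IOM (2009) total weight gain recommendations for singleton term pregnancies by pre-pregnancy BMI category, and the sketch assumes no adjustment for gestational age at delivery, which the text does not describe.

```python
# IOM (2009) recommended total weight gain (kg) for singleton pregnancies,
# keyed by pre-pregnancy BMI category.
IOM_RANGES_KG = {
    "underweight": (12.5, 18.0),  # BMI < 18.5
    "normal": (11.5, 16.0),       # 18.5 <= BMI < 25
    "overweight": (7.0, 11.5),    # 25 <= BMI < 30
    "obese": (5.0, 9.0),          # BMI >= 30
}


def bmi_category(bmi):
    """Map a pre-pregnancy BMI (kg/m^2) to its IOM category name."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    return "obese"


def classify_gwg(pre_weight_kg, delivery_weight_kg, height_m):
    """Return 'below', 'within', or 'above' relative to the IOM range.

    Hypothetical helper: GWG is the difference between weight at delivery
    and pre-pregnancy weight, as described in the text.
    """
    gwg = delivery_weight_kg - pre_weight_kg
    bmi = pre_weight_kg / height_m ** 2
    low, high = IOM_RANGES_KG[bmi_category(bmi)]
    if gwg < low:
        return "below"
    if gwg > high:
        return "above"
    return "within"
```

For example, a woman of 1.65 m who weighed 60 kg before pregnancy and 73 kg at delivery has a normal pre-pregnancy BMI and a gain of 13 kg, which falls inside the 11.5–16 kg range.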

Statistical analysis

Descriptive statistics were used to summarize the characteristics of the study population. Continuous predictors were presented as median and interquartile range (IQR), while categorical predictors were presented as frequencies and percentages. Differences between GWG categories were tested using the Kruskal-Wallis test for continuous predictors and the Chi-square test or Fisher’s exact test for categorical predictors, as shown in Table 1.

Table 1

Maternal characteristics associated with GWG according to IOM recommendations.
Predictors		Gestational Weight Gain (IOM-2019)			P value
	Overall	Within	Below	Above
	1557	447(28.7)	506(32.5)	604(38.7)
Age (years)
≤ 19	154(9.9)	47(3.02)	51(3.28)	56(3.6)	0.531
20–35	1189(76.4)	346(22.22)	389(25)	454(29.16)
> 35	214(13.7)	54(3.47)	66(4.24)	94(6.04)
Height(cm)
1st tertile	534(34.34)	167(10.73)	187(12.03)	180(11.57)	0.003
2nd tertile	505(32.48)	146(9.39)	170(10.93)	189(12.15)
3rd tertile	516(33.18)	134(8.62)	147(9.45)	235(15.11)
Pre-gestational BMI (kg /m²)	25.6(22.2–30.2)	25(21.3–28.6)	24.8(21.8–30.2)	26.8(23.2–31.2)	< 0.001
Arm circumference(cm)
< 23	67(4.37)	23(1.50)	29(1.90)	15(0.89)	< 0.001
23–28	474(31)	147(9.61)	190(12.42)	137(8.95)
> 28	989(64.64)	264(17.25)	283(18.50)	442(28.89)
Body fat (%)	33.3(28.3–37.8)	32.3(26.9–36.6)	32.3(26.6–37)	34.7(30.3–39.1)	< 0.001
Gestational age (weeks)	39.4(38.5–40.3)	39.4(38.7–40.3)	39.2(38.1–40.1)	39.7(38.9–40.4)
Maternal education (years)
≤ 4	10(0.6)	1(0.06)	5(0.32)	4(0.26)	< 0.001
5–11	1181(75.9)	342(21.97)	389(24.98)	450(28.9)
≥ 12	365(23.5)	104(6.68)	111(7.13)	150(9.63)
Per capita income (R$)	666.7(400–1000)	665.9(400–970)	600(382.4–1000)	668(466.6–1000)	0.002
Race
White	722(46.3)	208(13.36)	223(14.32)	291(18.69)	0.392
Non-white	835(53.6)	239(15.35)	283(18.18)	313(20.1)	0.392
Marital status
Married or in a stable relationship	1359(87.3)	388(24.93)	441(28.32)	530(34.04)	0.896
Single, separated, or widowed	198(12.7)	59(3.79)	65(4.17)	74(4.75)	0.896
Physical activity
Adequate	175(11.2)	50(3.21)	59(3.79)	66(4.24)	0.951
Inadequate	524(33.7)	156(10.02)	172(11.05)	196(12.59)	0.951
Smoking
No	1434(92.1)	409(26.27)	449(28.84)	576(36.99)	< 0.001
Yes	123(7.9)	38(2.44)	57(3.66)	28(1.8)	< 0.001
Alcohol consumption
No	1238(79.5)	353(22.67)	401(25.75)	482(30.96)	0.885
Yes	319(20.5)	94(6.04)	105(6.74)	120(7.71)	0.885
Diabetes
No	1479(95.0)	429(27.55)	459(29.48)	591(37.96)	< 0.001
Yes	78(5)	18(1.16)	47(3.02)	13(0.83)	< 0.001
Hypertension
No	1448(93)	420(26.97)	470(30.19)	558(35.84)	0.608
Yes	109(7)	27(1.73)	36(2.31)	46(2.95)	0.608
hs-CRP (ng/mL)	5.9(3.1–11.7)	5.1(3–10)	6.1(3.2–11.9)	6.5(3.0–12.6)	0.137
HOMA (µIU/mL)	1.36(0.9–2.1)	1.4(0.9–2.1)	1.3(0.99–2.1)	1.42(1–2.2)	0.094
Hemoglobin (g/dL)	12.5(12–13.1)	12.6(11.9–13.1)	12.4(11.8–13)	12.6(12.1–13.2)	0.002
Glycated hemoglobin (%)	5.1(4.9–5.3)	5.1(4.9–5.3)	5.1(4.9–5.3)	5(4.8–5.3)	0.059
Cholesterol (mg/dL)	173(151–196)	172(152–196)	172(149–194)	174(152–198)	0.526
HDL-c (mg/dL)	56(48–64)	56(49–64)	55(47–62)	56(49–65)	0.012
LDL-c(mg/dL)	95(77–113)	94(79–111)	94(76–112)	96(77–115)	0.639
Triglycerides (mg/dL)	104(81–133)	104(80–134)	106(85–137)	100(80–129)	0.13
Data are presented as number (percentage) and median and interquartile range (percentile 25 - percentile 75).
Statistical differences among gestational weight gain groups were tested with: Kruskal-Wallis test for continuous predictors and χ2 test, Fisher's test for categorical predictors.
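As a worked check on how the categorical P values in Table 1 arise, the sketch below recomputes Pearson's Chi-square test for the smoking rows using only the Python standard library. The function names are illustrative, and the closed-form P value applies only to tables with 2 degrees of freedom (such as this 2 × 3 table); it is not a general replacement for a statistics package.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def p_value_df2(stat):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-stat / 2.0)


# Smoking counts from Table 1; columns are Within, Below, Above.
smoking = [
    [409, 449, 576],  # No
    [38, 57, 28],     # Yes
]
stat, df = chi_square_statistic(smoking)
p = p_value_df2(stat)
```

With these counts the statistic is roughly 17 on 2 degrees of freedom, which is consistent with the P < 0.001 reported for smoking in Table 1.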

Machine learning model design

Considering the different outcomes related to GWG, we employed a multiclass classification approach to evaluate whether changing strategies could enhance model performance. Separate models were developed for each GWG category: below IOM recommendations, within IOM recommendations, and above IOM recommendations. The models were evaluated independently without sharing any information during the process, as shown in Fig. 1.
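Building a separate model for each GWG category amounts to a one-vs-rest recoding of the three-level outcome. The following minimal sketch shows that recoding; the function name and label strings are assumptions made for illustration and do not come from the authors' repository.

```python
CLASSES = ("below", "within", "above")


def one_vs_rest_targets(labels):
    """Return one binary target vector per GWG category.

    For each category, a woman is coded 1 if she belongs to it and 0
    otherwise, so each category can be modelled independently.
    """
    return {c: [1 if y == c else 0 for y in labels] for c in CLASSES}


labels = ["below", "within", "above", "above"]
targets = one_vs_rest_targets(labels)
```

Each woman is positive in exactly one of the three binary problems, which is why the per-class results can be read side by side.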

Machine learning techniques

Data preprocessing

For quantitative predictors, standardization was performed using the z-score, separately in the training and test sets. All qualitative predictors were handled through one-hot encoding, where each category was considered separately for this procedure. Additionally, predictors with a percentage of missing values above 20% were removed, while those with less than 20% missing values were imputed using the mean, as recommended by previous studies in healthcare [23, 24].
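The preprocessing rules described above can be illustrated with a short standard-library sketch. It is a simplified reading of the text, not the authors' pipeline: the function names are hypothetical, a predictor with exactly 20% missing values is kept (the text does not say which side of the threshold that case falls on), and the function is meant to be applied to one data split at a time.

```python
from statistics import mean, pstdev


def preprocess_numeric(columns, max_missing=0.20):
    """Drop, impute, and z-score numeric predictors.

    columns maps a predictor name to a list of floats, with None for
    missing values. Predictors with more than max_missing missing are
    dropped; the rest are mean-imputed and then standardized.
    """
    result = {}
    for name, values in columns.items():
        missing_share = sum(v is None for v in values) / len(values)
        if missing_share > max_missing:
            continue  # too many missing values: remove the predictor
        observed = [v for v in values if v is not None]
        mu = mean(observed)
        filled = [mu if v is None else v for v in values]
        sd = pstdev(filled)
        result[name] = [(v - mu) / sd if sd > 0 else 0.0 for v in filled]
    return result


def one_hot(values):
    """One indicator column per category of a qualitative predictor."""
    return {c: [1 if v == c else 0 for v in values] for c in sorted(set(values))}


data = {
    "bmi": [20.0, None, 30.0, 25.0, 25.0],  # 20% missing: kept and imputed
    "hs_crp": [None, None, 1.0, 2.0, 3.0],  # 40% missing: dropped
}
scaled = preprocess_numeric(data)
race = one_hot(["white", "non-white", "white"])
```

An imputed value always maps to a z-score of 0, because the imputed mean is also the centre used for standardization.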

Algorithm Selection

We tested five different ML algorithms: CatBoost [25], XGBoost [26], LightGBM [27], AdaBoost, and Random Forest. For CatBoost, XGBoost, and LightGBM, we used their respective Python packages. For the other algorithms, we used the scikit-learn library [28].

Hyperparameter selection

Hyperparameter selection in the training set was performed through 10-fold cross-validation, using Bayesian optimization and random search strategies [29]. In cases of significant class imbalance, where the minority class represented less than 25% of the total outcomes, the Synthetic Minority Over-sampling Technique (SMOTE) was applied. Additionally, in the training set, the Boruta method was employed for feature selection [30].
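To make the imbalance rule concrete, the sketch below checks the 25% threshold and generates one synthetic minority sample by interpolation, which is the core idea behind SMOTE. It is a simplified illustration with hypothetical function names, not the implementation used in the study; a maintained library would normally be used in practice.

```python
import random


def needs_oversampling(labels, threshold=0.25):
    """True if the rarest class makes up less than threshold of the labels."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return min(counts.values()) / len(labels) < threshold


def smote_sample(minority, k=5, rng=None):
    """Create one synthetic point between a minority sample and a neighbour.

    minority is a list of numeric feature vectors (at least two). A base
    point and one of its k nearest minority neighbours are picked at
    random, and the new point lies on the segment joining them.
    """
    if rng is None:
        rng = random.Random(0)
    base = rng.choice(minority)
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # position along the segment, in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]


# Class sizes taken from Table 1 (within, below, above).
cohort_labels = ["within"] * 447 + ["below"] * 506 + ["above"] * 604
trigger = needs_oversampling(cohort_labels)
synthetic = smote_sample([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
```

With the three-class counts from Table 1 the smallest class is 28.7% of the sample, so the check returns False under this reading of the rule.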

Model evaluation

The models with the best performance in the training set (which corresponded to 70% of the data) were selected for evaluation in the test set (30%). The evaluation of machine learning algorithms was conducted in the test set, based on metrics such as area under the ROC curve (AUC-ROC), area under the precision-recall curve (AUC-PR), precision, recall, positive predictive value, negative predictive value, Matthews correlation coefficient (MCC), and F1 score. Finally, the interpretation and evaluation of each predictor's contribution to the outcome were obtained through the calculation of Shapley values [31–33] in the test set. We adhered to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines [34].
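Most of the threshold-based metrics listed above follow directly from the four cells of a binary confusion matrix. The sketch below spells out the standard formulas; the function name and the example counts are illustrative only and are not results from this study.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard metrics from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. both classes occur and
    the model predicts both classes at least once.
    """
    precision = tp / (tp + fp)  # identical to positive predictive value
    recall = tp / (tp + fn)     # sensitivity
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)        # negative predictive value
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for one one-vs-rest problem.
metrics = binary_metrics(tp=40, fp=20, fn=10, tn=30)
```

Unlike accuracy, MCC uses all four cells symmetrically, which makes it a useful companion metric when the positive class is not half of the sample.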

Results

Maternal characteristics and GWG

The study included 1557 pregnant women, with 28.7% having GWG within the Institute of Medicine (IOM) recommendations, 32.5% below the recommendations, and 38.7% above the recommendations, as shown in Fig. 2. The majority of the women were aged 20–35 years (76.4%) and were predominantly non-white (53.6%). Key characteristics associated with GWG categories included pre-gestational BMI, maternal age, glycemic profile, hemoglobin levels, and arm circumference. The prevalence of diabetes differed significantly across GWG categories (P < 0.001) and was highest among women with GWG below the recommendations, whereas the prevalence of hypertension did not differ significantly (P = 0.608) (Table 1).

Performance of predictive models

The LightGBM model demonstrated the best overall performance with an AUC-ROC of 0.79 for predicting GWG within recommendations, 0.756 for below recommendations, and 0.624 for above recommendations. The Random Forest model also performed well, with an AUC-ROC of 0.774 for within recommendations, 0.732 for below recommendations, and 0.593 for above recommendations. Other algorithms, such as CatBoost, XGBoost, AdaBoost, and Logistic Regression, were also evaluated (Table 2, Figs. 3, 4 and 5).

Table 2

Predictive performance on test data of the best algorithm for each outcome with hyperparameter tuning
Model + Class	Hyperparameter Tuning	AUC-ROC	Accuracy	Recall	Specificity	Precision	F1	MCC
LightGBM (GWG Within)	{'num_leaves': 31, 'learning_rate': 0.1}	0.79	0.75	0.58	0.83	0.62	0.60	0.42
XGBoost (GWG Within)	{'n_estimators': 200, 'max_depth': 3, 'learning_rate': 0.1}	0.79	0.74	0.59	0.82	0.60	0.60	0.41
Random Forest (GWG Within)	{'n_estimators': 100, 'max_depth': 10}	0.77	0.76	0.58	0.82	0.60	0.59	0.41
CatBoost (GWG Within)	{'learning_rate': 0.1, 'iterations': 100, 'depth': 6}	0.77	0.75	0.60	0.82	0.61	0.61	0.42
LightGBM (GWG Below)	{'num_leaves': 31, 'learning_rate': 0.1}	0.76	0.68	0.74	0.64	0.57	0.64	0.37
XGBoost (GWG Below)	{'n_estimators': 200, 'max_depth': 3, 'learning_rate': 0.1}	0.76	0.68	0.69	0.68	0.58	0.63	0.36
CatBoost (GWG Below)	{'learning_rate': 0.1, 'iterations': 100, 'depth': 6}	0.75	0.68	0.69	0.67	0.58	0.63	0.36
Random Forest (GWG Below)	{'n_estimators': 100, 'max_depth': 10}	0.73	0.68	0.77	0.59	0.55	0.64	0.35
AdaBoost (GWG Below)	{'n_estimators': 200, 'learning_rate': 0.1}	0.71	0.66	0.77	0.60	0.55	0.64	0.36
AdaBoost (GWG Within)	{'n_estimators': 200, 'learning_rate': 0.1}	0.71	0.72	0.55	0.80	0.57	0.56	0.35
CatBoost (GWG Above)	{'learning_rate': 0.1, 'iterations': 100, 'depth': 6}	0.61	0.69	0.36	0.85	0.49	0.42	0.23
XGBoost (GWG Above)	{'n_estimators': 200, 'max_depth': 3, 'learning_rate': 0.1}	0.65	0.65	0.36	0.84	0.47	0.40	0.21
LightGBM (GWG Above)	{'num_leaves': 31, 'learning_rate': 0.1}	0.62	0.62	0.30	0.85	0.44	0.36	0.17
AdaBoost (GWG Above)	{'n_estimators': 200, 'learning_rate': 0.1}	0.57	0.57	0.22	0.89	0.43	0.29	0.13
Random Forest (GWG Above)	{'n_estimators': 100, 'max_depth': 10}	0.60	0.59	0.24	0.89	0.47	0.32	0.17
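The AUC-ROC column in Table 2 can be read as the probability that a randomly chosen woman from the target category receives a higher predicted score than a randomly chosen woman from the other categories. The sketch below computes AUC-ROC directly from that definition; the function name and toy data are illustrative, and the pairwise loop is chosen for clarity rather than speed.

```python
def auc_roc(y_true, scores):
    """AUC-ROC as the share of positive-negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Requires at least one positive (1) and one negative (0) label.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 3 of the 4 positive-negative pairs are ordered correctly.
auc = auc_roc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

A value of 0.5 corresponds to random ranking, which gives a reference point for the lower values reported for the "GWG Above" models.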

SHAP values and predictor importance

The use of SHAP values provided insight into the importance of various predictors for each GWG category. Pre-gestational BMI, maternal age, glycemic profile, hemoglobin levels, and arm circumference were identified as the most significant predictors. These variables were crucial in determining the likelihood of a pregnant woman falling into one of the GWG categories (below, within, or above IOM recommendations) (Fig. 6).
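For readers unfamiliar with Shapley values, the sketch below computes them exactly for a toy three-feature model by enumerating every feature subset. It illustrates the definition only: the toy model, the baseline of zeros, and the function names are assumptions, and tree-based explainers such as those in [31] obtain the same kind of attribution with far more efficient algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model at x relative to a baseline input.

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical score: one additive term plus one interaction."""
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

The attributions sum to the difference between the prediction at `x` and at the baseline, and the interaction term is split equally between the two features that share it.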

Discussion

The findings of this study underscore the potential of ML models in predicting GWG, thereby providing a novel approach to enhancing prenatal care. The LightGBM and Random Forest models, in particular, exhibited strong predictive capabilities, with LightGBM achieving the highest AUC-ROC values across all GWG categories. These results align with existing literature that highlights the superiority of gradient boosting algorithms for handling complex, non-linear relationships in large datasets. Recent studies indicate that boosting algorithms represent the state of the art for tabular data, demonstrating high performance across a wide range of tasks, including classification [35, 36].

The significant predictors identified in this study, such as pre-gestational BMI, maternal age, glycemic profile, hemoglobin levels, and arm circumference, are consistent with known risk factors for GWG. These predictors collectively capture the multifaceted influences on GWG, encompassing physiological, demographic, and lifestyle dimensions. Importantly, these predictors are relatively easy to collect, even in remote or resource-limited settings, enhancing the feasibility of deploying these ML models in diverse clinical environments. The inclusion of these predictors enhances the model's ability to accurately stratify women based on their risk of inadequate or excessive GWG, thereby facilitating targeted interventions.

Regarding the prediction of GWG using ML algorithms, the literature indicates that GWG is minimal during the first trimester due to initial physiological and hormonal changes. Most women gain little weight during this period. In the second trimester, weight gain begins to accelerate, offering an ideal window for monitoring and intervention [37, 38]. Early interventions for controlling GWG are more effective when initiated in the second trimester, as there is still sufficient time to implement lifestyle changes that can positively influence weight gain in the third trimester, when the gain is more pronounced. This aligns with studies suggesting that timely interventions can significantly impact pregnancy outcomes [1, 15, 16, 39].

Our results are consistent with other studies that have utilized machine learning to predict perinatal outcomes. For example, a study by Lee and Ahn (2020) demonstrated the effectiveness of ML models in predicting preterm birth, highlighting the importance of early and accurate predictions for timely intervention [7]. Similarly, Ramakrishnan, Rao, and He (2021) emphasized the potential of ML in identifying high-risk pregnancies and improving maternal-fetal health outcomes through early detection and personalized care [9].

The ease of collecting the significant predictors identified in this study makes these models particularly valuable for deployment in remote and resource-limited areas. In such settings, where access to advanced medical infrastructure may be limited, the ability to gather basic anthropometric and clinical data can still enable effective risk stratification and intervention.

Despite the promising results, several limitations must be acknowledged. The study's cohort is limited to a single geographic region (Araraquara, Brazil), which may affect the generalizability of the findings. Future research should aim to validate these models in diverse populations to ensure broader applicability.

The performance metrics, while robust, also indicate areas for improvement. For instance, the AUC-ROC values for predicting GWG above recommendations were lower compared to the other categories, suggesting a need for further refinement of the models to enhance their sensitivity to this particular outcome. Incorporating additional predictors, such as genetic factors or more detailed dietary intake information, could potentially improve predictive accuracy.

Furthermore, integrating these ML models into clinical practice requires careful consideration of practical and ethical implications. Clinicians must be adequately trained to interpret and act on model predictions, and safeguards should be in place to ensure data privacy and security. The development of user-friendly interfaces and decision-support systems will be essential for the seamless integration of these tools into routine prenatal care.

This study highlights the feasibility and utility of ML models in predicting GWG, offering a valuable tool for early identification and management of at-risk pregnancies. By leveraging advanced analytics, healthcare providers can deliver more personalized and effective prenatal care, ultimately contributing to better health outcomes for mothers and their babies. Future research and clinical efforts should focus on refining these models, validating their applicability in diverse settings, and addressing the practical challenges associated with their implementation. The timely prediction and intervention, particularly starting in the second trimester, could significantly enhance pregnancy management and outcomes, supporting the findings of previous research on the importance of early GWG control [40, 41]. The ease of data collection for key predictors makes these models especially valuable for deployment in remote areas, broadening the impact and accessibility of advanced prenatal care solutions.

Conclusion

This study demonstrates that ML models, particularly LightGBM and Random Forest, can effectively predict GWG categories. Our models achieved notable performance, with the LightGBM model showing the highest AUC-ROC values for predicting GWG within, below, and above the IOM recommendations.

The application of these predictive models in clinical practice can enable early identification of pregnant women at risk of excessive or inadequate weight gain. This early detection facilitates timely and personalized interventions, potentially reducing the risk of associated maternal and fetal complications. By integrating machine learning into prenatal care, healthcare providers can improve the monitoring and management of GWG, ultimately enhancing maternal and fetal health outcomes.

DATA AVAILABILITY STATEMENT

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation. The code developed for constructing the algorithms, along with the original dataset, is available on GitHub (https://github.com/Audency/Prediction-of-Gestational-Weight-Gain-for-Pregnancy.git).

ETHICS STATEMENT

This study was approved by the Research Ethics Committee of the School of Public Health, University of São Paulo (USP), prior to data collection, under CAEE number 59787216.2.0000.5421, opinion number 1.885.874. All participants provided informed consent, consistent with the principles outlined in the Helsinki Declaration.

AUTHOR CONTRIBUTIONS

AV: conceptualization, methodology. AV, FBF, GFS, AFC, and ADPC: investigation. AV, LAL, and FF: data analysis. AV, HG, FF, and GFS: visualization and writing - original draft. HG, ADPC, and PHCR: supervision, writing, review, and editing. All authors contributed to the article and approved the submitted version.

FUNDING

AV has received a scholarship from the São Paulo Research Foundation (FAPESP) (grant number 2023/07936-3). This study was supported by the São Paulo Research Foundation (FAPESP) (grant number 2015/03333-6).

CONFLICTS OF INTEREST

The authors declare that they have no conflicts of interest to disclose.

ACKNOWLEDGMENTS

We especially thank the professionals, undergraduate, and graduate students who collaborated in the data collection for the Araraquara cohort. The authors would like to thank the São Paulo Research Foundation (FAPESP) for financial support (grant number 2015/03333-6) and also for the principal author's scholarship (grant number 2023/07936-3).

References

Victor A, de França da Silva Teles L, Aires IO, de Carvalho LF, Luzia LA, Artes R, et al. The impact of gestational weight gain on fetal and neonatal outcomes: the Araraquara Cohort Study. BMC Pregnancy Childbirth. 2024;24:320.
Goldstein RF, Abell SK, Ranasinha S, Misso ML, Boyle JA, Harrison CL, et al. Gestational weight gain across continents and ethnicity: systematic review and meta-analysis of maternal and infant outcomes in more than one million women. BMC Med. 2018;16:153.
Macdonald-Wallis C, Tilling K, Fraser A, Nelson SM, Lawlor DA. Gestational weight gain as a risk factor for hypertensive disorders of pregnancy. Am J Obstet Gynecol. 2013;209:327.e1-327.e17.
Martínez-Hortelano JA, Cavero-Redondo I, Álvarez-Bueno C, Garrido-Miguel M, Soriano-Cano A, Martínez-Vizcaíno V. Monitoring gestational weight gain and prepregnancy BMI using the 2009 IOM guidelines in the global population: a systematic review and meta-analysis. BMC Pregnancy Childbirth. 2020;20:649.
IOM. Weight gain during pregnancy: Reexamining the guidelines. Washington, D.C.: National Academies Press; 2009.
Jordan MI, Mitchell TM. Machine learning: Trends, perspectives, and prospects. Science. 2015;349:255–60.
Lee K-S, Ahn KH. Application of Artificial Intelligence in Early Diagnosis of Spontaneous Preterm Labor and Birth. Diagnostics (Basel). 2020;10.
Ayodele TO. Machine learning overview. New Advances in Machine Learning. 2010;2:9–18.
Ramakrishnan R, Rao S, He J-R. Perinatal health predictors using artificial intelligence: A review. Womens Health (Lond). 2021;17:17455065211046132.
Batista AFM, Diniz CSG, Bonilha EA, Kawachi I, Chiavegatto Filho ADP. Neonatal mortality prediction with routinely collected data: a machine learning approach. BMC Pediatr. 2021;21:322.
Arayeshgari M, Najafi-Ghobadi S, Tarhsaz H, Parami S, Tapak L. Machine Learning-based Classifiers for the Prediction of Low Birth Weight. Healthc Inform Res. 2023;29:54–63.
Raj Pandey S, Ma J, Lai C-H, Raj Regmi P. A supervised machine learning approach to generate the auto rule for clinical decision support system. Trends in Medicine. 2020;20.
Champion ML, Harper LM. Gestational Weight Gain: Update on Outcomes and Interventions. Curr Diab Rep. 2020;20:11.
Gesche J, Nilas L. Pregnancy outcome according to pre-pregnancy body mass index and gestational weight gain. International Journal of Gynecology & Obstetrics. 2015;129:240–3.
Ren M, Li H, Cai W, Niu X, Ji W, Zhang Z, et al. Excessive gestational weight gain in accordance with the IOM criteria and the risk of hypertensive disorders of pregnancy: a meta-analysis. BMC Pregnancy Childbirth. 2018;18:281.
Truong YN, Yee LM, Caughey AB, Cheng YW. Weight gain in pregnancy: does the Institute of Medicine have it right? The American Journal of Obstetrics & Gynecology. 2015;212:362.e1-362.e8.
Davis RR, Hofferth SL, Shenassa ED. Gestational weight gain and risk of infant death in the United States. Am J Public Health. 2014;104 Suppl 1:S90-5.
Voerman E, Santos S, Inskip H, Amiano P, Barros H, Charles MA, et al. Association of Gestational Weight Gain With Adverse Maternal and Infant Outcomes. JAMA. 2019;321:1702–15.
Lipworth H, Barrett J, Murphy KE, Redelmeier D, Melamed N. Gestational weight gain in twin gestations and pregnancy outcomes: a systematic review and meta-analysis. BJOG. 2022;129:868–79.
Ranjbar A, Montazeri F, Farashah MV, Mehrnoush V, Darsareh F, Roozbeh N. Machine learning-based approach for predicting low birth weight. BMC Pregnancy Childbirth. 2023;23.
Naimi AI, Platt RW, Larkin JC. Machine Learning for Fetal Growth Prediction. Epidemiology. 2018;29:290–8.
Islam MN, Mustafina SN, Mahmud T, Khan NI. Machine learning to predict pregnancy outcomes: a systematic review, synthesizing framework and future research agenda. BMC Pregnancy Childbirth. 2022;22:348.
Kang H. The prevention and handling of the missing data. Korean Journal of Anesthesiology. 2013;64:402–6.
García-Laencina PJ, Sancho-Gómez JL, Figueiras-Vidal AR. Pattern classification with missing data: A review. Neural Comput Appl. 2010;19:263–82.
Prokhorenkova L, Gusev G, Vorobev A, Dorogush AV, Gulin A. CatBoost: unbiased boosting with categorical features. Adv Neural Inf Process Syst. 2018;31.
Chen T, Guestrin C. Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining. 2016. p. 785–94.
Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, et al. Lightgbm: A highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst. 2017;30.
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine learning in Python. the Journal of machine Learning research. 2011;12:2825–30.
Bergstra J, Yamins D, Cox D. Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. In: International conference on machine learning. PMLR; 2013. p. 115–23.
Kursa MB, Rudnicki WR. Feature selection with the Boruta package. J Stat Softw. 2010;36:1–13.
Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, et al. From local explanations to global understanding with explainable AI for trees. Nat Mach Intell. 2020;2:56–67.
Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. Adv Neural Inf Process Syst. 2017;30.
Rodríguez-Pérez R, Bajorath J. Interpretation of machine learning models using shapley values: application to compound potency and multi-target activity predictions. J Comput Aided Mol Des. 2020;34:1013–26.
Moons KGM, Altman DG, Reitsma JB, Ioannidis JPA, Macaskill P, Steyerberg EW, et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162:W1–73.
Shwartz-Ziv R, Armon A. Tabular data: Deep learning is not all you need. Information Fusion. 2022;81:84–90.
Borisov V, Leemann T, Seßler K, Haug J, Pawelczyk M, Kasneci G. Deep Neural Networks and Tabular Data: A Survey. IEEE Trans Neural Netw Learn Syst. 2022;:1–21.
Yang J, Wang M, Tobias DK, Rich-Edwards JW, Darling AM, Abioye AI, et al. Gestational weight gain during the second and third trimesters and adverse pregnancy outcomes, results from a prospective pregnancy cohort in urban Tanzania. Reprod Health. 2022;19:140.
Wei X, Shen S, Huang P, Xiao X, Lin S, Zhang L, et al. Gestational weight gain rates in the first and second trimesters are associated with small for gestational age among underweight women: a prospective birth cohort study. BMC Pregnancy Childbirth. 2022;22:106.
Goldstein RF, Abell SK, Ranasinha S, Misso M, Boyle JA, Black MH, et al. Association of Gestational Weight Gain With Maternal and Infant Outcomes. JAMA. 2017;317:2207.
Kominiarek MA, O’Dwyer LC, Simon MA, Plunkett BA. Targeting obstetric providers in interventions for obesity and gestational weight gain: A systematic review. PLoS One. 2018;13:e0205268.
Ren P, Yang XJ, Railton R, Jendza J, Anil L, Baidoo SK. Effects of different levels of feed intake during four short periods of gestation and housing systems on sows and litter performance. Anim Reprod Sci. 2018;188:21–34.

No competing interests reported.

SuplemmentaryMaterial.docx


Editorial decision: Revision requested
12 Jun, 2024
Editor assigned by journal
06 Jun, 2024
Submission checks completed at journal
06 Jun, 2024
First submitted to journal
27 May, 2024
