The findings of this study underscore the potential of ML models in predicting GWG, thereby providing a novel approach to enhancing prenatal care. The LightGBM and Random Forest models, in particular, exhibited strong predictive capabilities, with LightGBM achieving the highest AUC-ROC values across all GWG categories. These results align with existing literature that highlights the strength of gradient boosting algorithms in handling complex, non-linear relationships in large datasets. Recent studies indicate that boosting algorithms represent the state of the art for tabular data, demonstrating high performance across a wide range of tasks, including classification [35, 36].
The significant predictors identified in this study, such as pre-gestational BMI, maternal age, glycemic profile, hemoglobin levels, and arm circumference, are consistent with known risk factors for GWG. These predictors collectively capture the multifaceted influences on GWG, encompassing physiological, demographic, and lifestyle dimensions. Importantly, these predictors are relatively easy to collect, even in remote or resource-limited settings, enhancing the feasibility of deploying these ML models in diverse clinical environments. The inclusion of these predictors enhances the model's ability to accurately stratify women based on their risk of inadequate or excessive GWG, thereby facilitating targeted interventions.
Regarding the prediction of GWG using ML algorithms, the literature indicates that GWG is minimal during the first trimester because of initial physiological and hormonal changes, and most women gain little weight during this period. In the second trimester, weight gain begins to accelerate, offering an ideal window for monitoring and intervention [37, 38]. Interventions for controlling GWG are more effective when initiated in the second trimester, as there is still sufficient time to implement lifestyle changes that can positively influence weight gain in the third trimester, when the gain is more pronounced. This aligns with studies suggesting that timely interventions can significantly impact pregnancy outcomes [1, 15, 16, 39].
Our results are consistent with other studies that have utilized machine learning to predict perinatal outcomes. For example, a study by Lee and Ahn (2020) demonstrated the effectiveness of ML models in predicting preterm birth, highlighting the importance of early and accurate predictions for timely intervention. Similarly, Ramakrishnan, Rao, and He (2021) emphasized the potential of ML in identifying high-risk pregnancies and improving maternal-fetal health outcomes through early detection and personalized care.
The ease of collecting the significant predictors identified in this study makes these models particularly valuable for deployment in remote and resource-limited areas. In such settings, where access to advanced medical infrastructure may be limited, the ability to gather basic anthropometric and clinical data can still enable effective risk stratification and intervention.
Despite the promising results, several limitations must be acknowledged. The study's cohort is limited to a single geographic region (Araraquara, Brazil), which may affect the generalizability of the findings. Future research should aim to validate these models in diverse populations to ensure broader applicability.
The performance metrics, while robust, also indicate areas for improvement. For instance, the AUC-ROC values for predicting GWG above recommendations were lower than those for the other categories, suggesting a need for further refinement of the models to improve their discrimination for this particular outcome. Incorporating additional predictors, such as genetic factors or more detailed dietary intake information, could potentially improve predictive accuracy.
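To make the per-category comparison concrete, the following is a minimal sketch of how a one-vs-rest AUC-ROC can be computed for each GWG category from predicted class probabilities. It is an illustration under stated assumptions, not the evaluation code used in this study: the category labels ("below", "within", "above"), the function names, and the toy probabilities are all hypothetical and do not reproduce any reported result.

```python
from typing import Dict, List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win
    (the Mann-Whitney U formulation of the AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(
    y_true: Sequence[str],
    proba: Sequence[Sequence[float]],
    classes: List[str],
) -> Dict[str, float]:
    """Compute one AUC per class, treating that class as positive and all others as negative."""
    result = {}
    for k, name in enumerate(classes):
        labels = [1 if y == name else 0 for y in y_true]
        scores = [row[k] for row in proba]  # predicted probability of class k
        result[name] = binary_auc(labels, scores)
    return result


# Hypothetical category names and toy predictions (assumptions, not study data).
CLASSES = ["below", "within", "above"]
y_true = ["below", "below", "within", "within", "above", "above"]
proba = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.5, 0.4],
    [0.2, 0.3, 0.5],
    [0.3, 0.5, 0.2],
]

aucs = one_vs_rest_auc(y_true, proba, CLASSES)
for name in CLASSES:
    print(f"{name}: AUC-ROC = {aucs[name]:.4f}")
```

Reporting the metric per category, rather than as a single averaged value, is what exposes a weaker class such as GWG above recommendations. The pairwise loop is quadratic in the number of samples, so for a full cohort a rank-based implementation from an established library would be the more practical choice.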
Furthermore, integrating these ML models into clinical practice requires careful consideration of practical and ethical implications. Clinicians must be adequately trained to interpret and act on model predictions, and safeguards should be in place to ensure data privacy and security. The development of user-friendly interfaces and decision-support systems will be essential for the seamless integration of these tools into routine prenatal care.
This study highlights the feasibility and utility of ML models in predicting GWG, offering a valuable tool for early identification and management of at-risk pregnancies. By leveraging advanced analytics, healthcare providers can deliver more personalized and effective prenatal care, ultimately contributing to better health outcomes for mothers and their babies. Future research and clinical efforts should focus on refining these models, validating their applicability in diverse settings, and addressing the practical challenges associated with their implementation. Timely prediction and intervention, particularly starting in the second trimester, could significantly enhance pregnancy management and outcomes, consistent with previous research on the importance of early GWG control [40, 41]. The ease of data collection for key predictors makes these models especially valuable for deployment in remote areas, broadening the impact and accessibility of advanced prenatal care solutions.