Data Source and Study Population
This study included 1681 pregnant women recruited at their initial prenatal visit to the General Hospital of Ningxia Medical University between February 2022 and April 2023. During antenatal checkups at the hospital's obstetric clinic, all participants completed a set of standardized questionnaires covering sociodemographic factors, behavioral habits, pregnancy conditions, and clinical information, as well as their psychological state and sleep quality during pregnancy. All participants provided written informed consent, and the study protocol was approved by the Ethics Committee of Ningxia Medical University.
The inclusion criteria were as follows: (1) age 18 years or older; (2) at least 8 weeks pregnant at the time of enrollment; (3) complete sleep information available; (4) no history of psychiatric illness and the ability to understand and complete the questionnaires. The exclusion criteria were: (1) a diagnosed sleep disorder; (2) acute or critical illness and severe pregnancy complications; (3) severe infections during pregnancy or fetal malformations; (4) missing information on sleep or other critical variables. A total of 1846 women were initially surveyed. The final study population comprised 1681 women who completed the questionnaires assessing sleep quality, stress, and depressive status during pregnancy.
Sleep Quality Assessment
Sleep quality was evaluated using the Pittsburgh Sleep Quality Index (PSQI), an instrument validated for measuring sleep quality in Chinese pregnant women[22, 23]. The PSQI is a 19-item self-rated questionnaire that assesses sleep quality over the preceding month. Total scores range from 0 to 21, with higher scores indicating poorer sleep quality. In line with previous studies, a sleep disorder was defined as a total score ≥ 5[22].
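The scoring rule above can be sketched as follows. This is a minimal Python illustration rather than the authors' code (the analyses in this study were run in R). It assumes the 19 items have already been converted into the seven standard PSQI component scores (each 0 to 3); the function names are hypothetical.

```python
PSQI_CUTOFF = 5  # threshold stated in the text: total score >= 5


def psqi_total_score(components):
    """Sum the seven PSQI component scores (each 0-3) into a total of 0-21."""
    if len(components) != 7:
        raise ValueError("expected seven component scores")
    if any(c not in (0, 1, 2, 3) for c in components):
        raise ValueError("each component score must be an integer from 0 to 3")
    return sum(components)


def has_sleep_disorder(components, cutoff=PSQI_CUTOFF):
    """Apply the study's cutoff to the total score (assumed inclusive, >=)."""
    return psqi_total_score(components) >= cutoff
```

For example, component scores of `[1, 1, 1, 1, 1, 0, 0]` sum to 5 and would be labelled as a sleep disorder under this rule, whereas `[1, 1, 1, 1, 0, 0, 0]` (total 4) would not.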
Feature Data Collection for Pregnancy Sleep Disorders
Anxiety and depression were evaluated using the Pregnancy-Related Anxiety Questionnaire (PRAQ) and the Edinburgh Postnatal Depression Scale (EPDS), respectively. Women with an overall PRAQ score ≥ 24 were classified as experiencing pregnancy-related anxiety[24]. An EPDS standardized score ≥ 9 was taken to indicate depressive symptomatology[25].
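As a small illustration of how these two cutoffs turn questionnaire scores into binary features, consider the Python sketch below. It is not the authors' implementation; the function and key names are hypothetical, and the scores are assumed to be already totalled.

```python
PRAQ_CUTOFF = 24  # overall PRAQ score >= 24: pregnancy-related anxiety
EPDS_CUTOFF = 9   # EPDS score >= 9: depressive symptomatology


def classify_mood(praq_total, epds_score):
    """Return binary anxiety and depression indicators from the two scores."""
    return {
        "pregnancy_anxiety": praq_total >= PRAQ_CUTOFF,
        "depressive_symptoms": epds_score >= EPDS_CUTOFF,
    }
```

Both thresholds are inclusive, matching the "≥" in the text, so a participant scoring exactly 24 on the PRAQ or exactly 9 on the EPDS is counted as positive.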
Data on sociodemographics, pregnancy conditions, and clinical information were collected from pregnant women using standardized questionnaires. The questionnaires included participants’ age, education level, occupation, residence, pre-pregnancy BMI, gestational weeks for current pregnancy, personality traits, participant's only-child status, number of embryos, number of pregnancies, number of abortions, history of labor induction, method of conception, severity of morning sickness, pregnancy intention, mother's preference for fetus's sex, family's preference for fetus's sex, physical health before pregnancy, and underlying diseases.
To ensure data integrity, participants with more than 30% of their data missing were excluded from the study. For indicators with less than 30% missing values, the missing data were imputed using the median value of each respective category.
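The two-step handling of missing data can be sketched as below. This is an illustrative Python version under stated assumptions, not the authors' R code: missing values are represented as `None`, all variables are numerically coded, and the median is taken per variable over the retained participants. The function names are hypothetical.

```python
from statistics import median

MAX_MISSING_FRACTION = 0.30  # participants above this fraction are excluded


def drop_sparse_participants(rows, max_missing=MAX_MISSING_FRACTION):
    """Keep participants whose fraction of missing values is at most max_missing."""
    kept = []
    for row in rows:
        missing = sum(value is None for value in row.values())
        if missing / len(row) <= max_missing:
            kept.append(row)
    return kept


def impute_median(rows):
    """Replace each None with the median of the observed values of that variable."""
    if not rows:
        return []
    medians = {}
    for col in rows[0]:
        observed = [row[col] for row in rows if row[col] is not None]
        if observed:
            medians[col] = median(observed)
    # A variable with no observed values at all stays None.
    return [
        {col: (medians.get(col) if value is None else value) for col, value in row.items()}
        for row in rows
    ]
```

Calling `impute_median(drop_sparse_participants(rows))` applies the exclusion first, so the medians are computed only from participants who remain in the analysis. Whether the original analysis computed medians before or after exclusion is not stated in the text.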
Feature Selection for Pregnancy Sleep Disorders
Univariate regression analysis was first used to identify significant predictors, which were then entered into a multivariate regression model. To further refine feature selection, Least Absolute Shrinkage and Selection Operator (LASSO) regression was applied[26]. Combining logistic regression with LASSO yielded the subset of features from the original dataset most predictive of sleep disorders during pregnancy, and this subset was then used for model training.
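For reference, the standard L1-penalized logistic regression objective is written out below. It is given as an assumption about the form of LASSO used, since the text does not specify the exact formulation or how the penalty was tuned. Here $y_i \in \{0, 1\}$ indicates a sleep disorder for participant $i$, $x_i$ is the vector of $p$ candidate features, and $\lambda \ge 0$ is the penalty weight.

```latex
\hat{\beta}_0, \hat{\beta}
  = \arg\min_{\beta_0,\, \beta}
    \left\{
      -\frac{1}{n} \sum_{i=1}^{n}
        \left[
          y_i \left( \beta_0 + x_i^{\top} \beta \right)
          - \log\!\left( 1 + e^{\beta_0 + x_i^{\top} \beta} \right)
        \right]
      + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
    \right\}
```

Because the penalty is on the absolute values of the coefficients, larger values of $\lambda$ shrink some $\hat{\beta}_j$ exactly to zero, and the features with non-zero coefficients form the selected subset.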
The Development of ML Models
A 5-fold cross-validation method was employed to partition the dataset: in each round, four folds served as the training set for model development and the remaining fold was reserved as the test set to assess the model's predictive performance. Eight ML models were used to identify sleep disorders in pregnant women: Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), Naive Bayes (NB), Light Gradient Boosting Machine (LightGBM), Decision Tree (DT), and Artificial Neural Networks (ANNs). To enhance the performance of each ML algorithm, hyperparameters were fine-tuned using a grid search approach.
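The partitioning and grid enumeration described above can be sketched in plain Python as follows. This is an illustration only, not the authors' R workflow. The function names, the random seed, and the example hyperparameter names are hypothetical, and the text does not say whether folds were stratified by outcome or whether the grid search was nested inside the training folds.

```python
import random
from itertools import product


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is in the test fold exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


def parameter_grid(grid):
    """Yield every combination of hyperparameter values as a dict."""
    keys = sorted(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))
```

With 1681 participants and five folds, this produces one test fold of 337 and four of 336, each paired with the remaining participants as the training set. A grid such as `{"max_depth": [3, 5], "n_trees": [100, 200, 300]}` expands to six candidate settings, each of which would be fitted and scored on the folds.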
The Evaluation of ML Models
The model evaluation metrics were the area under the receiver operating characteristic curve (AUC), accuracy, precision, sensitivity, specificity, and the F1 score. After the performance of each ML model had been assessed, the algorithm with the best overall performance in identifying sleep disorders during pregnancy was selected. This model was then interpreted using SHAP (SHapley Additive Explanations)[21], with SHAP values calculated to quantify the importance of each feature.
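For concreteness, the six metrics can be computed from true labels, predicted labels, and predicted scores as in the Python sketch below. It uses the standard definitions and is not the authors' code (the text reports using R packages such as pROC and caret). Labels are assumed to be coded 1 for sleep disorder and 0 otherwise, and the function names are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity, specificity and F1 from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    # Ratios with an empty denominator are reported as 0.0 by convention here.
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + sensitivity
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / denom if denom else 0.0,
    }


def auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise form of the AUC is quadratic in the sample size, which is acceptable for a single test fold of a few hundred participants. It equals the area under the ROC curve obtained by sweeping the decision threshold.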
Statistical Analysis
Continuous variables were summarized as mean ± standard deviation (SD), and categorical variables as frequencies and percentages, n (%). Comparisons were made using the Student’s t-test for continuous data and the Rao-Scott Pearson χ² test for categorical data. Univariable and multivariable analyses were performed using logistic regression, with significant predictors from the univariable analysis subsequently included in the multivariable model. To mitigate the risk of overfitting, LASSO regression was applied to reduce the influence of multicollinearity and to enhance the model's robustness. The performance of the ML models was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, precision, sensitivity, specificity, and the F1 score. All statistical analyses were conducted in R, version 4.4.1, using packages including ggplot2, lattice, caret, tidyverse, skimr, pROC, randomForest, and xgboost. A two-tailed p-value of 0.05 or less was considered statistically significant.