Study participants
We developed and validated the model using routinely collected, retrospective health data from pregnant women with hypothyroidism who gave birth at a metropolitan tertiary teaching hospital in Wuxi, Jiangsu Province, China, from October 1, 2020 to December 31, 2022.
Inclusion and exclusion criteria
Diagnoses followed the Chinese Guidelines for the Diagnosis and Treatment of Thyroid Diseases during Pregnancy and Postpartum (Second Edition) [12]. Clinical hypothyroidism during pregnancy was diagnosed when serum TSH was above the upper limit of the pregnancy-specific reference range and serum FT4 was below the lower limit of the pregnancy-specific reference range. If a pregnancy-specific TSH reference range could not be obtained, the upper cut-off for TSH in early pregnancy was obtained in one of two ways: by lowering the upper limit of the general-population TSH reference range by 22%, or by using 4.0 mU/L.
Subclinical hypothyroidism (SCH) during pregnancy was diagnosed when serum TSH was above the upper limit of the pregnancy-specific reference range and serum FT4 was within the pregnancy-specific reference range.
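The diagnostic rules above can be sketched as a small decision function. This is an illustrative Python sketch only, not the authors' code; the function names and the numeric values in the usage line are hypothetical and are not reference ranges.

```python
from typing import Optional


def early_pregnancy_tsh_cutoff(general_upper_limit: Optional[float] = None) -> float:
    """Upper TSH cut-off (mU/L) in early pregnancy when no pregnancy-specific range exists."""
    if general_upper_limit is None:
        return 4.0  # fixed cut-off named in the text
    # Alternative named in the text: general-population upper limit lowered by 22%
    return general_upper_limit * (1 - 0.22)


def classify(tsh: float, ft4: float, tsh_upper: float,
             ft4_lower: float, ft4_upper: float) -> str:
    """Apply the stated criteria; all limits are supplied by the caller."""
    if tsh <= tsh_upper:
        return "TSH not elevated"
    if ft4 < ft4_lower:
        return "clinical hypothyroidism"
    if ft4 <= ft4_upper:
        return "SCH"
    # Elevated TSH with FT4 above range is not addressed by the stated criteria
    return "not covered by the stated criteria"


# Hypothetical values for illustration only
print(classify(5.2, 14.0, early_pregnancy_tsh_cutoff(), 12.0, 22.0))  # SCH
```

The caller passes the reference limits explicitly because the text ties them to the local pregnancy-specific range rather than to fixed numbers.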
Maternal inclusion criteria: (1) age 18 to 45 years; (2) hypothyroidism diagnosed before or during pregnancy, with complete case data.
Exclusion criteria: (1) dysfunction of major organs; (2) autoimmune diseases other than autoimmune thyroiditis; (3) severe uterine or vaginal malformations.
Candidate predictors
We selected candidate predictors on the basis of previous research [7, 13–17]. We evaluated the following predictors for inclusion in the model:
General maternal information included age, pre-pregnancy BMI, gestational age, gravidity, parity, mode of conception, cholesterol, platelets and hemoglobin. Past medical history included thyroid dysfunction, hypertension, diabetes mellitus, anemia and group B streptococcal infection. Family medical history included thyroid dysfunction, diabetes mellitus, hypertension and anemia. Current medical history included gestational diabetes, gestational hypertension, eclampsia/preeclampsia, group B streptococcal infection during pregnancy and perinatal anemia. Thyroid function and treatment during pregnancy included gestational age, TSH and FT4 levels at diagnosis, the number of TSH measurements during pregnancy, whether medication was used, whether treatment was given, whether TSH reached the target and the time taken to reach the target. Delivery methods included spontaneous delivery, lateral episiotomy, forceps delivery, emergency cesarean section/conversion from spontaneous delivery to cesarean section, and planned cesarean section. Newborn information included sex and birth weight.
Outcomes
The primary outcome was a composite of adverse maternal and infant outcomes.
Adverse maternal outcomes included threatened abortion/threatened preterm delivery, preterm delivery, abortion/stillbirth, premature rupture of membranes, placental abruption, placental dysfunction, polyhydramnios, oligohydramnios, fetal malformation, intrauterine growth restriction, intrauterine fetal distress, postpartum hemorrhage, postpartum fever, postpartum thyroid dysfunction, and number of hospitalizations.
Adverse neonatal outcomes included low birth weight, macrosomia, low Apgar score (≤ 7 points at 1–5 min after birth), transfer to the neonatal intensive care unit (NICU), neonatal thyroid dysfunction, neonatal asphyxia, neonatal hyperbilirubinemia, neonatal pneumonia/infection, and neonatal anemia.
Sample size
The sample size for this case-cohort study was calculated on the basis of the study by Spyridoula Maraka [18]. The required sample size was 570; allowing for 20% sample loss, the total sample size was 713.
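The text does not state how the 20% loss was applied. One reading that reproduces the reported total is to divide the required size by the retained fraction (1 − 0.20) and round up; the sketch below shows that arithmetic. The function name is hypothetical, and the formula is an assumption that happens to match the reported figure, not a confirmed description of the authors' calculation.

```python
import math


def inflate_for_loss(n_required: int, loss_percent: int) -> int:
    """Inflate a required sample size so that n_required remain after the expected loss."""
    # Integer percentages keep the division exact: 570 * 100 / 80 = 712.5
    return math.ceil(n_required * 100 / (100 - loss_percent))


print(inflate_for_loss(570, 20))  # 713
```

Multiplying by 1.20 instead would give 684, which does not match the reported 713.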
Data pre-processing
Variables with more than 20% missing values were removed, and subjects with more than 20% missing items were excluded. Where fewer than 20% of values were missing, missing values of continuous variables were imputed with the mean or median, and missing values of categorical variables were assigned as "No" or "Uncertain". For logistic regression analysis, gestational age, number of hospitalizations, gravidity and parity were converted to binary variables: term birth (gestational age 37–41 weeks) versus preterm birth (gestational age < 37 weeks) [there were no post-term births (gestational age ≥ 42 weeks) in this study]; one versus multiple hospitalizations; one versus multiple pregnancies; and one versus multiple deliveries. Because only one parturient had a forceps delivery, forceps delivery and episiotomy were combined into a single category.
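The missing-data rules can be sketched as follows. This is a minimal Python illustration, not the authors' implementation. It assumes that missing values are stored as None, that variables are screened before subjects, that exactly 20% missing is retained, and that the median and "Uncertain" are used as fill values; the text leaves each of these choices open. The function name and field names are hypothetical.

```python
from statistics import median


def preprocess(rows, continuous, threshold=0.20):
    """rows: list of dicts sharing the same keys; None marks a missing value."""
    n = len(rows)
    # Step 1: remove variables whose missing fraction exceeds the threshold
    kept = [v for v in rows[0]
            if sum(r[v] is None for r in rows) / n <= threshold]
    # Step 2: exclude subjects whose missing fraction (over kept variables) exceeds it
    subjects = [{v: r[v] for v in kept} for r in rows]
    subjects = [r for r in subjects
                if sum(x is None for x in r.values()) / len(kept) <= threshold]
    # Step 3: impute what is left (median for continuous, "Uncertain" for categorical)
    for v in kept:
        observed = [r[v] for r in subjects if r[v] is not None]
        fill = median(observed) if v in continuous else "Uncertain"
        for r in subjects:
            if r[v] is None:
                r[v] = fill
    return subjects
```

Screening variables first means a subject is not excluded merely because of a variable that is dropped anyway; reversing the order would be an equally defensible reading of the text.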
Statistical analysis
The 698 women and their newborns were randomly divided into a training set and a validation set at a ratio of 7:3. The model was built on the training set and verified on the validation set. There were 488 cases in the training set and 210 cases in the validation set.
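A random 7:3 split of this kind can be sketched as below. The Python code is illustrative only; the seed and the function name are hypothetical, and rounding the training size down is an assumption that reproduces the reported 488 and 210 cases.

```python
import random


def split_train_validation(ids, train_fraction=0.7, seed=2022):
    """Shuffle the identifiers reproducibly and cut them at the training fraction."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded generator, global state untouched
    n_train = int(len(shuffled) * train_fraction)  # rounds down: 698 * 0.7 -> 488
    return shuffled[:n_train], shuffled[n_train:]


train, validation = split_train_validation(range(698))
print(len(train), len(validation))  # 488 210
```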
The 698 pregnant women were also divided into two groups according to the presence or absence of adverse outcomes, as were the 698 newborns (each twin pregnancy was treated as two women and their newborns). The occurrence of any maternal or neonatal adverse outcome was defined as an adverse pregnancy outcome. We designed a case report form to collect all patient information from the hospital information system (HIS), including past medical history, family medical history and present medical history. A logistic regression model was built on the basis of univariate analysis. Stepwise regression minimizing the Akaike information criterion (AIC) was used to select variables for inclusion in the nomogram [19].
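AIC-based selection can be illustrated with a backward-elimination loop. The text does not say whether the stepwise search ran forward, backward or in both directions, so the direction here is an assumption. The callback aic_of is hypothetical: in practice it would fit a logistic regression on the given variables and return its AIC, computed as 2k − 2 ln L for k parameters and maximized likelihood L. The toy scoring function in the usage lines is invented for illustration and is not study data.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def backward_stepwise(variables, aic_of):
    """Drop one variable at a time for as long as doing so lowers the AIC."""
    current = list(variables)
    best = aic_of(current)
    while len(current) > 1:
        # AIC of every model obtained by removing a single variable
        trials = [(aic_of([v for v in current if v != drop]), drop)
                  for drop in current]
        trial_aic, drop = min(trials)
        if trial_aic >= best:
            break  # no single removal improves the AIC
        best = trial_aic
        current.remove(drop)
    return current, best


# Toy stand-in for model fitting (hypothetical numbers, not study results):
# each variable lowers -2 ln L by its "gain" and costs 2 as an extra parameter.
gain = {"age": 10.0, "tsh": 8.0, "parity": 1.0, "bmi": 0.5}
toy_aic = lambda vs: 600.0 - sum(gain[v] for v in vs) + 2 * len(vs)
print(backward_stepwise(["age", "tsh", "parity", "bmi"], toy_aic))
# (['age', 'tsh'], 586.0)
```

A variable survives only if it improves the fit by more than the penalty of 2 that the criterion charges for each extra parameter.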
Model performance measures included AUC, sensitivity, specificity and the Brier score. The cut-off is a threshold applied to the model's predicted probability p: if p ≥ cut-off, the prediction is positive; otherwise, the prediction is negative.
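The cut-off rule and three of these measures can be sketched as follows. This is an illustrative Python sketch, not the authors' R code; the function name and the toy inputs in the usage line are hypothetical.

```python
def evaluate_at_cutoff(y_true, p_pred, cutoff):
    """Return sensitivity, specificity and Brier score for 0/1 outcomes."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, p_pred):
        predicted_positive = p >= cutoff  # rule from the text: p >= cut-off is positive
        if y == 1:
            tp += predicted_positive
            fn += not predicted_positive
        else:
            fp += predicted_positive
            tn += not predicted_positive
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Brier score: mean squared difference between predicted probability and outcome
    brier = sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)
    return sensitivity, specificity, brier


# Toy inputs for illustration only
print(evaluate_at_cutoff([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1], cutoff=0.5))
```

Sensitivity and specificity change with the chosen cut-off, whereas the Brier score depends only on the predicted probabilities.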
Discriminatory power was assessed by the area under the receiver operating characteristic (ROC) curve (AUC). Calibration was evaluated with a calibration curve, which plots the probabilities predicted by the nomogram against the observed probabilities; a curve close to the ideal 45° line indicates good agreement. Decision curve analysis (DCA) was performed to evaluate the clinical utility of the nomogram.
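Two of these quantities have compact definitions that can be sketched directly. The AUC equals the probability that a randomly chosen case with the outcome receives a higher predicted probability than a randomly chosen case without it (ties counted as one half). In decision curve analysis, the net benefit at threshold probability pt is TP/n − (FP/n) × pt/(1 − pt). The Python below is an illustration with hypothetical function names and toy inputs, not the authors' R code.

```python
def auc(y_true, p_pred):
    """Rank-based AUC: share of (positive, negative) pairs ordered correctly."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, p_pred, threshold):
    """Net benefit of treating everyone whose predicted probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, p_pred) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, p_pred) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Toy inputs for illustration only
y, p = [1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]
print(auc(y, p))               # 0.75
print(net_benefit(y, p, 0.2))  # 0.4375
```

A decision curve plots this net benefit across a range of thresholds and compares it with the treat-all and treat-none strategies.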
All P values were two-tailed, and P < 0.05 was considered statistically significant. All statistical analyses were performed using the R programming language and environment.