Study design and participants
In this international retrospective observational study, we used data from two cohorts of adolescents living with obesity who underwent a first bariatric procedure (Roux-en-Y gastric bypass [RYGB], sleeve gastrectomy [SG] or adjustable gastric banding [AGB]): one from France, gathering patients from 5 academic centers, and one from Sweden, based on the national registry. Participants were aged 12 to 20 years and were followed for up to 5 years after surgery. If a patient had a subsequent bariatric intervention within 5 years of the first one, data were censored at the time of reoperation. Data points for patients lost to follow-up were kept in the analysis but censored after the last visit.
In France, data were collected at the following university hospitals: Lille, Angers, and Paris (Bicêtre, Robert-Debré, Necker and Armand-Trousseau Hospitals), for the period from March 2001 to December 2022. Data for all patients under 18 years came from electronic or paper medical records. In Lille, we also enrolled participants from the ABOS cohort (NCT01129297) aged between 18 and 20 years. To reduce the amount of missing data, some patients were also contacted by telephone and e-mail, particularly for long-term follow-up. In Sweden, virtually all patients undergoing metabolic and bariatric surgery are registered in the nationwide research and quality registry, SOReg (the Scandinavian Obesity Surgery Registry). The registry is continuously validated and has so far been shown to have very high data validity (20). Anonymous data from SOReg were included in the current study.
The study protocol was approved by the Ethics Review Committee (n°2023-070) of the University of Lille. Non-opposition forms were sent by post to all French participants. In Sweden, all patients are informed of the registry and that research will be conducted based on information from it (“opt-out”). Given the anonymous extraction of data, no additional ethics approval was necessary in Sweden.
Data analyzed for each patient at baseline included the seven clinical items shown to be the most relevant for predicting weight loss in adults: type of intervention, height, weight and age at surgery, smoking status, and diagnosis and duration of type 2 diabetes (T2D). Smoking status was defined as 'smoking' or 'nonsmoking' at surgery. We then analyzed postoperative height and weight measured at 12, 24, 36 and 60 months. Total weight loss (TWL) in percent was calculated as:
TWL = (visit weight – preoperative weight)/preoperative weight × 100
The preoperative weight was the weight measured before any preoperative weight reduction.
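As an illustration only, the TWL formula above can be written as a small Python function. This is a sketch rather than the study code (the analysis itself was run in R). The function name and the example patient are hypothetical, and the sign convention follows the formula as written, so a negative value indicates weight lost.

```python
def total_weight_loss_pct(visit_weight_kg: float, preop_weight_kg: float) -> float:
    """Percent total weight loss, with the sign convention of the formula above.

    A negative result means the patient weighs less than before surgery.
    """
    if preop_weight_kg <= 0:
        raise ValueError("preoperative weight must be positive")
    return (visit_weight_kg - preop_weight_kg) / preop_weight_kg * 100.0


# Hypothetical patient: 120 kg before surgery, 90 kg at the 12-month visit.
twl_12m = total_weight_loss_pct(visit_weight_kg=90.0, preop_weight_kg=120.0)
print(f"TWL at 12 months: {twl_12m:.1f}%")
```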
Study outcomes
The primary outcome was the accuracy of 5-year BMI prediction, expressed as the median absolute deviation (MAD) between predicted and observed BMI, in kg/m². Secondary outcomes were weight loss prediction accuracy at earlier postoperative visits (months 12 and 24), expressed as MAD and as normalized MAD (percentage of BMI).
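The two accuracy metrics can be sketched as follows. This is a minimal illustration under stated assumptions, not the study code. In particular, the text does not spell out how MAD is normalized, so dividing each absolute error by the observed BMI is an assumption made here. The function names and BMI values are hypothetical.

```python
from statistics import median


def median_absolute_deviation(predicted, observed):
    """Median of |predicted - observed|, in the unit of the inputs (kg/m² for BMI)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return median(abs(p - o) for p, o in zip(predicted, observed))


def normalized_mad_pct(predicted, observed):
    """ASSUMPTION: each absolute error is expressed as a percentage of observed BMI."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return median(abs(p - o) / o * 100.0 for p, o in zip(predicted, observed))


# Hypothetical predicted and observed BMI values (kg/m²) for three patients.
predicted_bmi = [30.0, 28.0, 35.0]
observed_bmi = [32.0, 27.0, 31.0]

mad = median_absolute_deviation(predicted_bmi, observed_bmi)
nmad = normalized_mad_pct(predicted_bmi, observed_bmi)
print(f"MAD = {mad:.2f} kg/m², normalized MAD = {nmad:.2f}%")
```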
Adaptation of the adult prediction model to the adolescent cohort
For the purpose of this study, the French and Swedish cohorts were first merged into a single dataset, which was then divided into two subsets as follows. Within each of the three intervention subgroups (AGB, SG, RYGB), 80% of patients were randomly selected to form the training subset, and the remaining 20% were assigned to the testing subset. This stratified assignment ensures that both the training and testing subsets reflect the distribution of interventions observed in the merged cohort.
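The stratified 80/20 assignment described above can be sketched as follows. This is an illustrative Python version, not the study code. The function name, the record layout, the random seed and the toy cohort sizes are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_split(patients, key, train_fraction=0.8, seed=0):
    """Randomly split records into train/test subsets within each stratum."""
    rng = random.Random(seed)  # fixed seed (arbitrary) for reproducibility
    by_group = defaultdict(list)
    for patient in patients:
        by_group[key(patient)].append(patient)

    train, test = [], []
    for group in sorted(by_group):
        members = by_group[group][:]
        rng.shuffle(members)
        n_train = round(train_fraction * len(members))
        train.extend(members[:n_train])
        test.extend(members[n_train:])
    return train, test


# Hypothetical toy cohort: 10 AGB, 20 SG and 70 RYGB patients.
interventions = ["AGB"] * 10 + ["SG"] * 20 + ["RYGB"] * 70
cohort = [{"id": i, "intervention": iv} for i, iv in enumerate(interventions)]

train, test = stratified_split(cohort, key=lambda p: p["intervention"])
test_counts = Counter(p["intervention"] for p in test)
print(len(train), len(test), dict(test_counts))
```

Because the split is done separately within each intervention, the testing subset keeps the same 10/20/70 proportions as the full toy cohort.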
Following the methodology developed for adults, we then trained a class of machine learning algorithms called decision trees on the training subset, first to learn meaningful subgroups of patients who share statistical similarities in their baseline characteristics, and second to fit a TWL prediction model for each subgroup. As detailed in the initial report (19), decision trees are non-linear predictors that can learn customized predictions depending on the type of bariatric intervention, as well as on other clinical variables.
To specialize the tree-based prediction model to adolescents, we used the seven attributes previously selected by the LASSO (least absolute shrinkage and selection operator) analysis performed in the adult study (19), namely type of intervention, preoperative weight, height, age at intervention, smoking status, and diabetes status and duration. We then applied the classification and regression trees (CART) algorithm to the training subset to learn adolescent-specific subgroups and their predicted TWL. Finally, we compared the predicted weight losses with the observed outcomes of patients in the testing subset.
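To make the idea of a regression tree concrete, the sketch below grows a very small tree by recursive binary splitting that minimizes the within-node sum of squared errors, which is the core of CART for continuous outcomes. It is a simplified illustration and not the rpart implementation used in the study: there is no pruning or complexity parameter, categorical variables are assumed to be numerically coded, and the toy data, feature coding and function names are hypothetical.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean (0 for an empty node)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, targets, min_leaf):
    """Return (sse_after_split, feature_index, threshold), or None if no valid split."""
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [t for r, t in zip(rows, targets) if r[j] <= thr]
            right = [t for r, t in zip(rows, targets) if r[j] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, thr)
    return best


def fit_tree(rows, targets, max_depth=2, min_leaf=2):
    """Grow a regression tree; each leaf stores the mean target of its subgroup."""
    split = best_split(rows, targets, min_leaf) if max_depth > 0 else None
    if split is None or split[0] >= sse(targets):
        return {"leaf": mean(targets)}
    _, j, thr = split
    left = [i for i, r in enumerate(rows) if r[j] <= thr]
    right = [i for i, r in enumerate(rows) if r[j] > thr]
    return {
        "feature": j,
        "threshold": thr,
        "left": fit_tree([rows[i] for i in left], [targets[i] for i in left],
                         max_depth - 1, min_leaf),
        "right": fit_tree([rows[i] for i in right], [targets[i] for i in right],
                          max_depth - 1, min_leaf),
    }


def predict(tree, row):
    """Follow the splits down to a leaf and return its mean target."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


# Hypothetical toy data: [intervention code (0 or 1), preoperative weight in kg],
# with TWL in percent as the target (negative = weight lost).
rows = [[0, 110.0], [0, 120.0], [0, 130.0], [1, 115.0], [1, 125.0], [1, 135.0]]
twl = [-10.0, -12.0, -11.0, -28.0, -30.0, -29.0]

tree = fit_tree(rows, twl)
pred_a = predict(tree, [0, 118.0])
pred_b = predict(tree, [1, 140.0])
print(pred_a, pred_b)
```

In this toy example the first split separates the two intervention codes, so each leaf corresponds to a subgroup with its own mean TWL, mirroring the "subgroup, then prediction" logic described above.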
Statistics
The baseline variables used to predict the weight loss trajectories had been selected using the least absolute shrinkage and selection operator (LASSO) algorithm (21) and were extracted retrospectively from the French and Swedish patients’ medical records. The prediction model consisted of decision trees, trained at each outcome date (months 12, 24 and 60 after bariatric surgery) to predict percent total weight loss (TWL) from the baseline variables, using the classification and regression trees (CART) algorithm (22). Predicted TWL values were then converted into predicted weight and BMI.
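The conversion from predicted TWL to predicted weight and BMI can be sketched as below. This is an illustration under stated assumptions rather than the study code. TWL is taken with the sign convention of the TWL formula given earlier (negative when weight is lost), and because the text does not state which height measurement enters the predicted BMI, the height is simply passed in as a parameter. Function names and values are hypothetical.

```python
def predicted_weight_kg(preop_weight_kg: float, predicted_twl_pct: float) -> float:
    """Invert TWL = (visit - preop) / preop * 100 to recover the visit weight."""
    return preop_weight_kg * (1.0 + predicted_twl_pct / 100.0)


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m²."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical patient: 120 kg before surgery, predicted TWL of -25%, height 1.70 m.
weight_5y = predicted_weight_kg(preop_weight_kg=120.0, predicted_twl_pct=-25.0)
bmi_5y = bmi(weight_5y, height_m=1.70)
print(f"predicted weight = {weight_5y:.1f} kg, predicted BMI = {bmi_5y:.1f} kg/m²")
```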
The accuracy of the predicted outcomes with respect to the observed outcomes was reported in terms of MAD. The 95% confidence intervals (CI) for MAD were estimated by bootstrap (bias-corrected and accelerated [BCa] method, n=10,000 replications). The predictive model for adolescents was compared with the adult predictive model, which had itself been validated by comparison with other predictive models in the literature. These comparisons were based on MAD.
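A BCa bootstrap interval for MAD can be sketched in plain Python as follows. This is a minimal illustration of the general BCa recipe (bias correction from the bootstrap distribution, acceleration from the jackknife), not the routine used in the study. The tie handling, the quantile indexing, the seed and the simulated errors are assumptions made for the example.

```python
import random
from statistics import NormalDist, median


def bca_interval(data, statistic, n_resamples=10_000, alpha=0.05, seed=0):
    """Bias-corrected and accelerated (BCa) bootstrap confidence interval."""
    rng = random.Random(seed)
    norm = NormalDist()
    n = len(data)
    theta_hat = statistic(data)
    boot = sorted(statistic(rng.choices(data, k=n)) for _ in range(n_resamples))

    # Bias correction: position of the point estimate in the bootstrap
    # distribution (ties counted as one half, an assumption made here).
    below = sum(b < theta_hat for b in boot)
    ties = sum(b == theta_hat for b in boot)
    p = (below + 0.5 * ties) / n_resamples
    p = min(max(p, 1.0 / (n_resamples + 1)), n_resamples / (n_resamples + 1))
    z0 = norm.inv_cdf(p)

    # Acceleration: skewness of the leave-one-out (jackknife) estimates.
    jack = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = sum(jack) / n
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_level(q):
        z = norm.inv_cdf(q)
        return norm.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))

    def quantile(q):
        idx = int(round(q * (n_resamples - 1)))
        return boot[min(max(idx, 0), n_resamples - 1)]

    return quantile(adjusted_level(alpha / 2)), quantile(adjusted_level(1 - alpha / 2))


# Hypothetical absolute BMI errors (kg/m²) for 60 patients, simulated for the example.
sim = random.Random(42)
abs_errors = [abs(sim.gauss(0.0, 3.0)) for _ in range(60)]

# Fewer resamples than the 10,000 reported above, only to keep the example fast.
ci_low, ci_high = bca_interval(abs_errors, median, n_resamples=2000)
print(f"MAD = {median(abs_errors):.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

The median is a non-smooth statistic, so the jackknife-based acceleration term is only a rough approximation in this setting.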
We also used Bland-Altman plots to evaluate model calibration and one-sample t-tests to assess the presence of systematic bias. Median (interquartile range [IQR]) 5-year weight and BMI trajectories of participants undergoing each operation were plotted as a function of time using nonlinear smoothing.
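The quantities behind a Bland-Altman analysis and the accompanying one-sample t-test can be sketched as follows. This is an illustrative Python version, not the study code. The 1.96 multiplier for the limits of agreement is the usual convention and is assumed here, and the BMI values are hypothetical. Only the t statistic and its degrees of freedom are computed, because the p-value requires the Student t distribution, which is not available in the Python standard library.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman_summary(predicted, observed):
    """Mean bias, 95% limits of agreement, and one-sample t statistic for bias = 0."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need at least two paired values")
    diffs = [p - o for p, o in zip(predicted, observed)]
    n = len(diffs)
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "bias": bias,
        "lower_loa": bias - 1.96 * sd,
        "upper_loa": bias + 1.96 * sd,
        "t_statistic": bias / (sd / sqrt(n)),
        "df": n - 1,
    }


# Hypothetical predicted and observed BMI values (kg/m²) for four patients.
predicted_bmi = [30.0, 28.0, 35.0, 33.0]
observed_bmi = [32.0, 27.0, 31.0, 34.0]

summary = bland_altman_summary(predicted_bmi, observed_bmi)
print(summary)
```

A bias close to zero with a small t statistic suggests no systematic over- or under-prediction, while the limits of agreement describe the spread of individual prediction errors.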
Subgroup analyses were performed by splitting the study cohort by intervention, pregnancy status, age (younger vs. older than 19 years) and AGB removal.
Analyses were performed using R version 4.3.1, with the rpart package for the CART implementation. We followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD+AI) guidelines to report the development and validation of the prediction model.