Overall, this study developed an initial prediction model for the inconsistency of driver gene mutations between NOG/PDX models and patients' samples. A total of 53 lung cancer NOG/PDX models were successfully engrafted and excised, including 42 NOG/PDX models whose driver gene mutations matched those of the parental tumors and 11 NOG/PDX models whose mutations did not. To analyze this small, unbalanced dataset, we built three types of models with five algorithms, namely LR based on AIC, LASSO-LR, SVM, XGBoost, and CatBoost, all of which showed excellent predictive capability. According to the evaluation indexes in the testing groups, CatBoost and SVM demonstrated the best performance. Moreover, the application of SMOTE generally improved the performance of SVM relative to its baseline.
LR illustrated what exactly determined the genotypes of NOG/PDX models
LR is a widely used ML technique in biomedical data analysis because it is reasonably easy to interpret and clearly shows whether each variable is positively or negatively associated with the predicted probability[25]. Accordingly, the two multivariable LR models revealed the critical predictor variables.
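To make the interpretability point concrete, the following minimal sketch shows how the sign of an LR coefficient maps to the direction of its effect on the predicted probability, and how its exponential gives an odds ratio. The predictor names, coefficients, and intercept are hypothetical placeholders chosen for illustration; they are not the values fitted in this study.

```python
import math

# Hypothetical coefficients for illustration only; NOT taken from the study.
coefficients = {"predictor_a": 0.8, "predictor_b": -0.5}
intercept = -1.0

def predicted_probability(features):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(beta_i * x_i))))."""
    z = intercept + sum(coefficients[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

for name, beta in coefficients.items():
    direction = "raises" if beta > 0 else "lowers"
    # exp(beta) is the multiplicative change in the odds per unit increase.
    print(f"{name}: odds ratio {math.exp(beta):.2f}, {direction} the predicted probability")
```

A positive coefficient gives an odds ratio above 1 and a negative coefficient gives one below 1, which is what allows each predictor to be read as a risk or protective factor.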
The formation and passaging of PDX models are dynamic events in which clonal and subclonal alterations frequently occur, especially when P1 PDX models develop slowly, which gives tumor cells adequate time to mutate and adapt to a new environment[26, 27]. In addition to this cell-autonomous heterogeneity, stromal heterogeneity in the tumor microenvironment (TME) is another critical reason why the driver genotype of a PDX model may differ from that of the parental tumor[12]. As expected, most of the predictive features from the univariable analysis were factors associated with xenograft engraftment, except for pathology. SCC has been reported to be much more tumorigenic in nude mice than adenocarcinoma (ADC)[18], which contrasts with our finding that SCC is the most challenging type in which to establish genetically matched NOG/PDX models. More CD8+ TILs were detected in the cancer nests of SCC than in non-SCC[28], which suggests that PDX models of SCC may lose more tumor stroma during xenograft engraftment. Moreover, SCC tends to carry significantly more clonal mutations than ADC[29], which may also contribute to more clonal selection.
In the multivariable analysis, age, the number of driver mutations, EGFR mutations, the type of prior chemotherapy, prior TKI therapy, and the sample source all played significant roles in the inconsistency of driver gene mutations. Although age carried only a small weight in the multivariable LR, we have not found an appropriate theory to explain why a younger rather than an older age is a risk factor for inconsistent driver gene mutations[30]. However, this surprising factor suggests that the age of the implanted mice might play a critical role in the establishment of PDX models. Most PDX models in current research use 8-week-old mice rather than aged mice (> 8 months). Recent studies found that aging can dramatically alter the components of the tumor microenvironment[31]; the mismatch between the age of the mice and that of the patients could therefore be the reason why age emerges as a predictive feature here. Another feature, sample source, also had a negative effect on genotype matching, in contrast to its effect on tumor engraftment. Although fluid sources, including MPE and lymph, have been shown to have a higher engraftment rate than solid tumor tissues[32], we found that fluid-derived tumor xenografts were less likely to maintain the driver genotypes of the parental tumors.
The number of driver gene mutations, including clonal and subclonal mutations, is associated with intra-tumor heterogeneity, genomic instability, and chromosomal instability[33]. The number of driver gene mutations also had the largest coefficient in the multivariable LR model, which illustrates its dominant importance in the development of non-patient-matched genotypes. Secondly, PDX models from EGFR-mutant lung cancers have been reported to show poor histological differentiation and frequent loss of EGFR mutations[34], which supports the high risk of inconsistency in EGFR-mutant NOG/PDX models in this study. Thirdly, the evidence that pemetrexed increases the number of TILs and upregulates immune-related genes involved in antigen presentation might support the conclusion that PDX models from patients receiving pemetrexed are less likely to maintain the original genotypes[35]. TKIs have been shown to alter the pulmonary TME, with increased CD8+ T cells and mononuclear myeloid-derived suppressor cells (M-MDSCs) (CD11b+Ly6G−Ly6Chigh), and fewer Foxp3+ T regulatory cells (Tregs) and M2-like macrophages (CD206+)[36]. In addition, clonal selection occurs frequently during TKI therapy and results in TKI resistance[37]. Interestingly, we found that the factors promoting TILs were conducive to the stability of genotypes during NOG/PDX model establishment, which needs further verification (Fig. 6).
SVM-RFE and GBDT provide a robust and straightforward classifier
Unlike LR, both SVM and GBDT resemble "black boxes," which show only the inputs and outputs without revealing their internal workings[38]. SVM and GBDT are considered to have reliable classification power, a lower risk of overfitting, and the ability to handle unbalanced data, all of which were validated in this study. Therefore, when an accurate classifier is needed immediately and the model does not have to be explained in detail, CatBoost, SVM, or SVM-SMOTE is a better choice for predicting the inconsistency of driver gene mutations, with significantly better performance.
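The oversampling idea behind SVM-SMOTE can be sketched as follows. This is a simplified, standard-library illustration of the core SMOTE step (interpolating between a minority sample and one of its nearest minority neighbours), not the implementation or settings used in this study. The function name `smote`, the value of `k`, and the two-dimensional synthetic points are assumptions; only the class sizes (11 minority, 42 majority) come from the text.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    randomly chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Other minority samples, sorted by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Synthetic 2-D minority class of 11 samples (illustrative data, not study data).
data_rng = random.Random(1)
minority = [(data_rng.gauss(2.0, 1.0), data_rng.gauss(2.0, 1.0)) for _ in range(11)]

# Generate 42 - 11 = 31 synthetic samples so both classes have 42 members.
new_points = smote(minority, n_new=42 - 11)
print(len(minority) + len(new_points))
```

In practice, oversampling of this kind is applied only to the training groups, so that the testing groups still reflect the original class imbalance.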
ML for small biomedical unbalanced datasets
ML has recently become a promising approach to predictive modeling in numerous areas, enabling prediction models to “learn” information systematically from initial data and adapt to each new data environment[39]. However, ML has not been widely applied to small-sample datasets (fewer than ten events per predictor variable), which are common in biomedical animal models because of their high cost and technical complexity[40]. Ultimately, the ML algorithms we used to establish predictive tools for lung cancer NOG/PDX models demonstrated excellent performance. This work not only provides a predictive tool to screen lung cancer patients for NOG/PDX models for precision immunotherapy but also offers a general approach for building prediction models in small biomedical samples:
(1) Select features to develop a multivariable model in all samples with standard ML algorithms, such as stepwise LR based on AIC, LASSO-LR, SVM (or SVM-SMOTE), XGBoost, and CatBoost.
(2) Perform stratified random sampling to generate 100 training groups and testing groups to achieve stable performance.
(3) Formulate the predictive score or establish the predictive classifier in training groups.
(4) Evaluate the predictive model based on ROC, accuracy, and F1 score in the corresponding testing groups to determine the optimal algorithm or modeling strategy.
(5) Interpret the critical predictors for the positive class using LR, and apply the optimal algorithm for the final prediction.
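Steps (2) to (4) can be sketched with the standard library alone. The sketch below is an illustration under stated assumptions, not the study's pipeline: the single synthetic feature, the 30% testing fraction, and the toy threshold classifier `fit_threshold` (which stands in for LR, SVM, or GBDT) are all hypothetical. Only the class sizes (42 matched, 11 non-matched) and the 100 repeats come from the text.

```python
import random
from statistics import mean

def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx), sampling each class separately."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test += idx[:n_test]
        train += idx[n_test:]
    return train, test

def auc(y_true, scores):
    """ROC AUC: probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy_f1(y_true, y_pred):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    acc = sum(1 for t, p in pairs if t == p) / len(pairs)
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return acc, f1

def fit_threshold(xs, ys):
    """Toy stand-in classifier: cut-off at the midpoint of the two class means."""
    m1 = mean(x for x, y in zip(xs, ys) if y == 1)
    m0 = mean(x for x, y in zip(xs, ys) if y == 0)
    return (m0 + m1) / 2

def evaluate(xs, ys, n_repeats=100, test_fraction=0.3, seed=0):
    """Average AUC, accuracy, and F1 over repeated stratified splits."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_repeats):
        train, test = stratified_split(ys, test_fraction, rng)
        thr = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        y_test = [ys[i] for i in test]
        s_test = [xs[i] for i in test]
        y_pred = [int(s > thr) for s in s_test]
        acc, f1 = accuracy_f1(y_test, y_pred)
        results.append((auc(y_test, s_test), acc, f1))
    return [mean(col) for col in zip(*results)]

# Synthetic data: 42 negatives and 11 positives with one illustrative feature.
data_rng = random.Random(42)
ys = [0] * 42 + [1] * 11
xs = [data_rng.gauss(2.0 if y else 0.0, 1.0) for y in ys]
mean_auc, mean_acc, mean_f1 = evaluate(xs, ys)
print(f"AUC={mean_auc:.3f} accuracy={mean_acc:.3f} F1={mean_f1:.3f}")
```

Stratifying each split keeps the minority class represented in every testing group, and averaging over many splits reduces the variance that a single split would show on only 53 samples.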