The present study aimed to assess the capability of radiomic features, either alone or combined with RWD, to predict efficacy and survival outcomes in patients with advanced NSCLC treated with ICIs. Two types of endpoint were used to evaluate the predictive capability of radiomics: CBR, which is a mixed clinical-radiological endpoint, and survival status at pre-specified time points, namely 6 and 24 months (OS6 and OS24, respectively). Survival status endpoints are of particular clinical relevance, since they could identify specific classes of patients who are ideal candidates for treatment escalation or de-escalation. Indeed, OS6 aimed at identifying early progressors, i.e., patients who do not benefit from ICIs and have a life expectancy < 6 months. In contrast, the OS24 endpoint could help identify long-term survivors, i.e., patients who are particularly sensitive to ICIs and potentially “cured” by them. Two feature sets, radiomics alone and radiomics combined with RWD, were analyzed to predict the outcomes, and four different ML classifiers were trained for each of the three endpoints.
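As an illustration of how a fixed-horizon endpoint such as OS6 or OS24 can be derived from follow-up data, the following minimal sketch assumes that class 1 means "alive at the time point" and class 0 means "died before it". The function name, the example patients, and the choice to leave patients censored before the horizon unlabeled are assumptions made for illustration and are not taken from the study protocol.

```python
from typing import Optional


def survival_status(followup_months: float, died: bool, horizon_months: float) -> Optional[int]:
    """Binary survival status at a fixed horizon (hypothetical helper).

    Returns 1 if the patient was observed alive up to the horizon, 0 if the
    patient died before it, and None if follow-up ended without death before
    the horizon, in which case the status cannot be determined.
    """
    if followup_months >= horizon_months:
        return 1  # alive up to the horizon; a later death does not change the label
    if died:
        return 0  # death observed before the horizon
    return None  # censored early: assumed here to be excluded from that endpoint


# Hypothetical patients: (follow-up in months, death observed?)
patients = [(3.0, True), (14.0, True), (30.0, False), (4.0, False)]
os6 = [survival_status(t, d, 6) for t, d in patients]
os24 = [survival_status(t, d, 24) for t, d in patients]
print(os6)   # [0, 1, 1, None]
print(os24)  # [0, 0, 1, None]
```

The second patient shows why the two endpoints carry different information: the same individual belongs to class 1 for OS6 and to class 0 for OS24.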
Overall, performance in predicting survival outcomes (OS6 and OS24) was significantly better than in predicting the clinical-radiological outcome (CBR), reflecting the importance of choosing the right endpoint when evaluating treatment efficacy, especially in the immunotherapy era. Indeed, several attempts have been made to move beyond the classical radiological criteria in assessing the response to ICIs, such as adopting iRECIST instead of RECIST 1.1 criteria in this context. However, IO typically exerts its antitumor activity in a more complex way than tumor volume reduction, and patients can benefit from IO treatment even in the absence of radiological changes.
Across multiple outcomes, the multimodal (radiomics plus RWD) feature set often outperformed the radiomics-only set, highlighting the importance of integrating RWD with refined data types such as images or biological omics. However, in some cases, such as the OS24 models, including clinical data did not result in better prediction capability in terms of AUC. This observation suggests that, in some selected cases, radiomic features extracted from medical images could per se recapitulate clinical information, which could then be omitted without losing predictive ability. Importantly, both the radiomics and the multimodal models outperformed the clinically recognized IO biomarker, PD-L1 expression on tumor specimens expressed as a categorical variable (< 1%, 1–49%, ≥ 50%), in predicting survival outcomes.
The highest predictive performance was obtained when predicting long-term survival (OS24), with an accuracy of 0.71 and an AUC of 0.79 on the test set, using the combination of 10 radiomic and RWD features and applying Logistic Regression as the classifier. Several studies have attempted to predict the efficacy of ICIs in patients with advanced NSCLC using radiomics. In particular, lesion heterogeneity and non-uniform density patterns were associated with better response to ICIs, [35] while peritumoral heterogeneity and disorganized vasculature were linked to hyperprogressive disease. [14] Integrating radiomics with clinical data such as albumin and lymphocyte levels further improved prediction performance over using a single data source. [36] Furthermore, the radiomic signature of immune infiltration based on CD8B expression proposed by Sun et al. was able to predict anti-PD-1/PD-L1 outcomes. [38] Finally, the multimodal integration of radiomics, histopathology, and genomics enabled early stratification of responders versus non-responders. [38, 39] However, information about the radiomic features relevant for these models was often underreported, underlining the importance of XAI to enhance reproducibility across different clinical studies.
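For readers less familiar with the two metrics quoted above, the following minimal sketch shows how accuracy and AUC can be computed from the predicted probabilities of any binary classifier. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions for illustration only; they are not the study data, and in practice a library routine would normally be used.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of patients whose thresholded prediction matches the true class."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random class-1 patient receives a higher
    score than a random class-0 patient (ties count as one half)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined when only one class is present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (not study data): true OS24 class and predicted probability of class 1
y_true = [1, 0, 1, 0, 1]
y_prob = [0.9, 0.4, 0.6, 0.7, 0.8]
print(accuracy(y_true, y_prob))           # 0.8
print(round(roc_auc(y_true, y_prob), 3))  # 0.833
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two values can differ for the same model.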
If confirmed in a larger prospective dataset, our results, which are consistent with the available literature, could have useful applications in clinical practice. Indeed, identifying patients at higher risk of rapid progression and death (class 0 in the OS6 outcome) could help select the ideal candidates for treatment intensification, which could be pursued with treatment schedules that include a more aggressive chemotherapy regimen, or with novel therapeutic approaches such as antibody-drug conjugates. Conversely, patients with expected long-term survival under immunotherapy (class 1 in the OS24 outcome) could be safely offered treatment de-intensification, e.g., a chemotherapy-free regimen, ideally sparing them treatment toxicities without compromising survival outcomes.
We used an XAI technique, namely SHAP, to better understand which features most influenced the models' predictions. The 10 most important features selected for the best-performing model were: ECOG PS, sex, metastatic stage (versus locally advanced disease), presence of brain metastases, presence of bone metastases, and total number of metastatic sites among the RWD features; and Large Area Emphasis, Elongation, Sphericity, and Dependence Variance (see supplementary table for details) among the radiomic features. This type of analysis is of particular importance in hybrid models, where understanding which clinical and radiological features are the most informative for the prediction is essential for clinicians and researchers to prioritize the data sources that best recapitulate tumor biology. In this context, the most relevant clinical characteristics in our study are those well known to have a prognostic impact on cancer patients’ survival. The consistency of these results with clinical knowledge helps make these models trusted by the scientific community, which is an essential step for the implementation of AI-based models in future clinical practice, with full human oversight. Regarding the radiomic features listed in the resulting SHAP graphs, some are intuitively associated with prognostic outcomes, such as major axis length (which approximates the “T” dimension in the TNM classification), whereas others cannot be visualized by the human eye and are therefore difficult to relate to clinical outcomes. Although attributing specific radiomic features to tumor biology is challenging, some of them, e.g., the Large Dependence Emphasis feature, could be linked with biological intratumor heterogeneity, in line with literature linking tumor heterogeneity to differences in immune activation and immunotherapy response. [36]
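To make the idea of a SHAP attribution concrete, the sketch below uses the known closed form for linear models: when features are treated as independent, the SHAP value of a feature for a logistic regression (in log-odds units) is its coefficient multiplied by the difference between the patient's value and the background mean. The feature names, coefficients, and data are hypothetical and are not the fitted model from this study, which may have used a different SHAP explainer.

```python
import math
from statistics import fmean
from typing import List, Sequence


def log_odds(weights: Sequence[float], intercept: float, x: Sequence[float]) -> float:
    """Linear predictor of a logistic regression (before the sigmoid)."""
    return intercept + sum(w * v for w, v in zip(weights, x))


def linear_shap(weights: Sequence[float], background: Sequence[Sequence[float]],
                x: Sequence[float]) -> List[float]:
    """SHAP values in log-odds units for a linear model, assuming independent
    features: coefficient * (feature value - background mean)."""
    means = [fmean(column) for column in zip(*background)]
    return [w * (v - m) for w, v, m in zip(weights, x, means)]


# Hypothetical model and data, for illustration only
features = ["ecog_ps", "n_metastatic_sites", "sphericity"]
weights, intercept = [-0.8, -0.3, 1.5], 0.2
background = [[0, 1, 0.6], [1, 2, 0.5], [2, 4, 0.7], [1, 1, 0.6]]
patient = [2, 4, 0.7]

contributions = linear_shap(weights, background, patient)
for name, value in zip(features, contributions):
    print(f"{name}: {value:+.2f}")

# Additivity check: contributions sum to the patient's log-odds minus the
# average log-odds over the background set.
baseline = fmean(log_odds(weights, intercept, row) for row in background)
assert math.isclose(sum(contributions), log_odds(weights, intercept, patient) - baseline)
```

In this toy example, a negative contribution pushes the prediction toward class 0 and a positive one toward class 1, which is the same reading applied to SHAP summary plots.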
The present study has some obvious limitations. Firstly, the retrospective nature of the study, which relied on CT scans acquired over a wide timeframe (2013–2022), could not guarantee the uniformity of the radiological images collected, potentially introducing noise during radiomic feature extraction. However, this potential pitfall could also strengthen the generalizability of the results, since a model trained on heterogeneous images may retain its discrimination performance when applied to external image sets. In addition, the heterogeneity of the patient cohort, which included patients treated with ICIs in different treatment lines and with different strategies (with or without chemotherapy), could limit the applicability of the model, since nowadays nearly all NSCLC patients receive ICIs as first-line treatment for advanced disease. The small sample size may also have influenced the results, warranting validation of the model in a larger, more homogeneous patient cohort.
Our model can be further improved. Firstly, applying models designed to predict continuous survival outcomes could result in better prognosis prediction at the individual patient level. Furthermore, a Deep Learning workflow could be adopted to build an end-to-end model in a non-handcrafted way, to test whether this approach could further improve the predictive ability of the established ML-based models. In addition, integrating RWD and radiomics with other data sources, such as genomics and digital pathology, could provide better insights into tumor biology and therefore further increase model performance, as already demonstrated in the literature. [38, 39] Another potential source of data is the longitudinal assessment of medical images, such as the implementation of delta-radiomic features (e.g., the extraction and fusion of features from CT/PET scans performed at baseline and at the first radiological evaluation). [40] To pursue these aims, we are conducting the international, retrospective-prospective I3LUNG study (NCT05537922), which uses data from multiple sources to build a comprehensive predictive model for advanced NSCLC patients treated with ICIs. [41]
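As a simple illustration of the delta-radiomics idea mentioned above, the sketch below computes the relative change of each feature between a baseline scan and the first radiological evaluation. Defining the delta as a relative change, the function name, and the feature values are assumptions for illustration; other definitions (for example, absolute differences) are also used.

```python
from typing import Dict


def delta_features(baseline: Dict[str, float], followup: Dict[str, float],
                   eps: float = 1e-12) -> Dict[str, float]:
    """Relative change (followup - baseline) / baseline for features present
    at both time points; features with a near-zero baseline are skipped."""
    deltas = {}
    for name in sorted(baseline.keys() & followup.keys()):
        base = baseline[name]
        if abs(base) < eps:
            continue  # relative change is undefined for a zero baseline
        deltas[f"delta_{name}"] = (followup[name] - base) / base
    return deltas


# Hypothetical feature values for one lesion at two time points
baseline_scan = {"major_axis_length": 40.0, "sphericity": 0.5}
first_evaluation = {"major_axis_length": 30.0, "sphericity": 0.6}
print(delta_features(baseline_scan, first_evaluation))
```

The resulting delta features could then be appended to the baseline radiomic and RWD features before model training.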