The metabolome, genome, transcriptome and proteome are core components of systems biology, and their complex interactions critically influence cancer development and metastasis 31. In biology, machine learning algorithms have been widely used to analyse multiple types of high-throughput data, providing new methods for solving complex biological problems 31. Immunotherapy is considered a prevailing therapeutic approach for LUSC; however, knowledge of biomarkers that predict immune response remains limited, which greatly restricts the development of immunotherapies for LUSC. Li et al. applied six machine learning algorithms to SEER database records on the demographic characteristics, diagnosis time, and treatments of LUSC patients to forecast prognostic status and treatment response, supporting the clinical management of LUSC and the improvement of treatment outcomes 34. Based on the TCGA database, Chen et al. used four machine learning methods to screen five hub genes and construct prediction models for clinical application in hepatocellular carcinoma 35. In addition, machine learning algorithms are an effective tool for analysing the potential connections within multi-omics data and their biological significance 36. Chu et al. provided a research basis for the early diagnosis and timely treatment of patients with muscle-invasive urothelial cancer by combining multi-omics data with multiple machine learning algorithms 17. However, multi-omics studies on LUSC remain few. Hence, we applied machine learning analysis to integrated multi-omics data of LUSC to screen for biomarkers and drug therapy targets, providing new possibilities for delivering personalised treatments.
The TME is a highly complex local environment that includes immune cells, tumour cells, epithelial cells, and stromal cells 37. Complex interactions between tumour cells and multiple constituents of the TME are central drivers of tumour metastasis and accelerated cancer progression 38. Within the same tumour, enormous variation may exist in category, intrinsic characteristics, stage, and TME status. Tumour heterogeneity has been observed to cause remarkable differences in treatment response among patients with the same type of tumour, such as LC 39. Marked molecular differences exist among individuals with the same tumour type and among different regions of the same tumour 39. Therefore, in-depth investigation of the TME can help to reveal the dynamic evolution of tumours, and unravelling the molecular mechanisms underlying tumour heterogeneity is crucial for realising precision medicine and improving patient prognosis. In recent years, owing to rapid advances in molecular biology and artificial intelligence, our understanding of tumour heterogeneity has gradually deepened. We analysed integrated sequencing data with multi-omics methods to explore the mechanisms of heterogeneity during tumour development, aiming to offer novel insights into individualised tumour treatment and to improve patients' quality of survival.
In this study, we obtained multi-omics data on LUSC from the TCGA and GEO databases. First, we applied the "ComBat" function to remove batch effects when merging the different datasets. We then used PCA to verify the efficacy and robustness of the merge. In addition, we identified two subtypes (CS1 and CS2) using ten multi-omics integrated clustering algorithms.
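The logic of checking a batch correction with a principal component projection can be illustrated with a minimal sketch. It does not reproduce ComBat, which applies empirical Bayes shrinkage to per-batch location and scale estimates; as a stand-in, it uses a simple per-batch, per-gene mean-centring. It then compares how far apart the batches sit on the first principal component before and after adjustment. All function names and the toy expression values are hypothetical.

```python
import math
from statistics import mean

def center_by_batch(samples, batches):
    """Stand-in for batch correction (NOT ComBat): shift each batch so that
    its per-gene mean equals the overall per-gene mean."""
    n_genes = len(samples[0])
    grand = [mean(s[g] for s in samples) for g in range(n_genes)]
    adjusted = [None] * len(samples)
    for b in set(batches):
        idx = [i for i, lab in enumerate(batches) if lab == b]
        b_mean = [mean(samples[i][g] for i in idx) for g in range(n_genes)]
        for i in idx:
            adjusted[i] = [samples[i][g] - b_mean[g] + grand[g]
                           for g in range(n_genes)]
    return adjusted

def first_pc_scores(samples, n_iter=200):
    """Sample scores on the first principal component (power iteration)."""
    n, n_genes = len(samples), len(samples[0])
    col_mean = [mean(s[g] for s in samples) for g in range(n_genes)]
    X = [[s[g] - col_mean[g] for g in range(n_genes)] for s in samples]
    v = [float(g + 1) for g in range(n_genes)]  # arbitrary non-zero start
    for _ in range(n_iter):
        scores = [sum(x[g] * v[g] for g in range(n_genes)) for x in X]
        w = [sum(X[i][g] * scores[i] for i in range(n)) for g in range(n_genes)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0:
            break
        v = [c / norm for c in w]
    return [sum(x[g] * v[g] for g in range(n_genes)) for x in X]

def batch_gap(scores, batches):
    """Spread of the batch means along the first principal component."""
    means = [mean(s for s, lab in zip(scores, batches) if lab == b)
             for b in set(batches)]
    return max(means) - min(means)

# Toy data: rows are samples, columns are genes; batch B carries an offset.
samples = [
    [1.0, 2.0, 3.0], [2.0, 1.0, 3.5], [1.5, 2.5, 2.5],   # batch A
    [6.0, 7.2, 8.0], [7.1, 6.0, 8.4], [6.4, 7.4, 7.7],   # batch B
]
batches = ["A", "A", "A", "B", "B", "B"]

gap_before = batch_gap(first_pc_scores(samples), batches)
gap_after = batch_gap(first_pc_scores(center_by_batch(samples, batches)), batches)
print(f"PC1 batch gap before: {gap_before:.3f}, after: {gap_after:.3g}")
```

In this toy case the batches separate clearly on the first component before adjustment and overlap afterwards, which is the pattern a PCA plot is used to confirm after merging real cohorts.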
We used the "getElite" function to filter gene features and found that CS2 showed the more favourable survival outcome of the two subtypes. Immune infiltration analysis revealed that immune cell infiltration was markedly increased in the CS1 group, whereas it was relatively low in the CS2 group. Subsequent analysis revealed that CS2 patients were more likely to benefit from treatment modalities such as radiation therapy and targeted therapy. Because some datasets had small sample sizes, we integrated them into the META-LUSC set for subsequent analyses. We screened 108 PRGs for the construction of the CMLS. With the StepCox and CoxBoost algorithms, we filtered out 24 hub genes and constructed the final model. We divided the samples into different CMLS groups and performed survival analyses to assess their prognostic value. We found that the high-CMLS group had a worse clinical prognosis. When compared with 12 other signatures, the CMLS achieved a better C-index, further demonstrating its robustness and reliability. Moreover, we evaluated the immunological landscape of LUSC using the "IOBR" package. The low-CMLS group exhibited significantly higher levels of immune cell infiltration, including NK cells, T cells and B cells, suggesting that these patients may have better survival outcomes. Further prediction of immune response also showed that the low-CMLS group had a markedly better prognosis and a more sensitive immune response. Finally, we screened potential therapeutic drugs to inform targeted drug treatment strategies for LUSC.
Numerous studies have found that predictive models may perform well in the training set but poorly in validation sets, and further analyses have attributed this to overfitting 41. To mitigate the bias that overfitting introduces into the assessment of a model's true performance, we adopted an integrated modelling strategy in which each constructed model is evaluated in multiple sets, enabling a more objective assessment of its performance in both the training and validation sets. Following previous studies 17, we used the mean C-index as the model evaluation criterion, combining the evaluation results of the models across all sets to construct a highly credible prediction model and thus achieve more accurate predictions for LUSC. Immune cells infiltrating the TME play various roles in tumourigenesis, thereby influencing tumour progression and the outcome of cancer treatment 42. Because the composition of immune cells varies from one tumour to another, the proportions of immune cells associated with different tumour types need to be analysed in order to explore the underlying mechanisms of tumour development. At present, there are many methods and tools for investigating immune infiltration; however, none of them is universally applicable or yields consistently reliable results, which limits our understanding of the tumour. The IOBR package provides a reliable analytical tool for exploring integrated multi-omics data on tumour-immune interactions and for characterising the TME 43. Here, we used the IOBR package to comprehensively analyse the TME and correlated the results with clinical information to provide a reference for the clinical treatment of LUSC.
The low-CMLS group was more sensitive to immunotherapy and had more abundant immune cell types, indicating that the CMLS might predict the prognosis and immune response of LUSC to some degree.
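The mean C-index criterion used for model evaluation can be sketched as follows. This is a minimal illustration, assuming Harrell's concordance index computed by direct pair counting and a plain average across cohorts; it does not reproduce the StepCox and CoxBoost pipeline, and the cohort names, candidate models and toy survival data are hypothetical.

```python
from statistics import mean

def concordance_index(times, events, risks):
    """Harrell's C-index by direct pair counting.
    A pair (i, j) is comparable when subject i had the event (events[i] == 1)
    and a strictly shorter follow-up time than subject j. It is concordant
    when i also has the higher predicted risk; tied risks count as 0.5."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in this cohort")
    return concordant / comparable

def mean_c_index(cohorts, risk_fn):
    """Average the C-index of one model over every cohort."""
    return mean(
        concordance_index(c["time"], c["event"], [risk_fn(x) for x in c["x"]])
        for c in cohorts.values()
    )

# Hypothetical cohorts: follow-up time, event flag (1 = death), two features.
cohorts = {
    "train": {"time": [5, 8, 12, 20], "event": [1, 1, 0, 1],
              "x": [[2.0, 0.1], [1.5, 0.9], [1.0, 0.3], [0.2, 0.5]]},
    "valid": {"time": [3, 9, 15], "event": [1, 0, 1],
              "x": [[1.8, 0.2], [0.9, 0.8], [0.4, 0.4]]},
}

# Hypothetical candidate models: each maps a feature vector to a risk score.
models = {
    "model_a": lambda x: x[0],
    "model_b": lambda x: x[1],
}

scores = {name: mean_c_index(cohorts, fn) for name, fn in models.items()}
best = max(scores, key=scores.get)
print(scores, "selected:", best)
```

Averaging over all cohorts rather than ranking on the training set alone penalises a model that fits the training data well but fails to order patients correctly elsewhere, which is the overfitting concern raised above.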
Nevertheless, our study has certain limitations. First, we included data from multiple sequencing platforms in our analyses, and although we did our best to minimise the differences between them, some differences between the datasets were inevitable. Second, the analytical results were not subjected to further experimental validation or mechanistic investigation in this study.