Retrieval of Transcriptome Data, Preparation, and Differentially Expressed Analysis
The data of RNA expression profiles and clinical information for GC were downloaded from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/repository), including 375 GC tissues and 32 non-tumor tissues. GTF files were downloaded from Ensembel (http://asia.ensembl.org) for annotation to distinguish the mRNAs and lncRNAs for subsequent analysis. A list of recognized immune-related genes (ir-genes) was downloaded from the ImmPort database (http://www.immport.org) and was used to screen irlncRNAs by a co-expression strategy. Correlation analysis was performed between ir-genes and all lncRNAs. Those with immune gene correlation coefficients more than 0.4 and p value less than 0.001 were considered as irlncRNAs. To identify the DEirlncRNA, we used R package Dseq2 for differential expression analysis among irlncRNAs. The thresholds were set as log fold change |FC| >2.0 along with a p value <0.05.
Pairing DEirlncRNAs
The DEirlncRNAs were cyclically singly paired, and a 0-or-1 matrix was constructed assuming C is equal to lncRNA A plus lncRNA B; C is defined as 1 if the expression level of lncRNA A is higher than lncRNA B, otherwise C is defined as 0. Then, the constructed 0-or-1 matrix was further screened. No relationship was considered between pairs and prognosis if the expression quantity of lncRNA pairs was 0 or 1 because pairs without a certain rank could not properly predict patient survival outcome. When the amounts of lncRNA-pairs of which expression quantity was 0 or 1 accounted for more than 20% of total pairs, it was considered a valid match.
Establishment of a Risk Model to Evaluate the risk score
Univariate Cox analysis was performed to assess the association between the expression levels of DEirlncRNA pairs and the overall survival (OS) of patients with a p value < 0.05. 81 pairs of DEirlncRNA pairs related to prognosis were selected. Then the least absolute shrinkage and selection operator (LASSO) model was used to find vital DEirlncRNA pairs from these 81 lncRNA pairs by utilizing the package glmnet in the R software as well as for construction of the model. Finally, 31 pairs of vital DEirlncRNA pairs were selected. The random forest plot was performed using the R package survminer. Preferable 10 of the 31 pairs were chosen and calculated the risk score by the formula: risk score=∑ki=1βiSi. Then the risk score for each sample was calculated based on the expression levels of the 10 DEirlncRNA pairs. According to the median risk score, GC samples were divided into high-risk group and low-risk group. 187 cases were classified into the high-risk group and 188 into the low-risk group. The AUC value of the model was calculated and drawn as a curve. The 1-, 2-, and 3-year ROC curves of the model were plotted.
Validation of the Constructed Risk Model
The Kaplan-Meier analysis was performed to single out the survival difference significantly associated with the OS from DElncRNAs pairs, which were selected by the LASSO method. The R packages utilized in these steps included survival, glmnet, pbapply, survivalROC, survminer, and pHeatmap.
To verify the clinical application value of the constructed model, we performed the chi-square test to analyze the relationship between the model and clinicopathological characteristics. Wilcoxon signed-rank test was used to calculate the risk score differences among different groups of these clinicopathological characteristics. To confirm whether the model can be used as an independent clinical prognostic predictor, we performed univariate Cox regression analyses between the risk score and clinicopathological characteristics. A forest map was used to demonstrate the results. The R packages utilized in these operations were survival, pHeatmap, and ggupbr.
Investigation of Tumor-Infiltrating Immune Cells
To analyze the relationship between the risk and immune-cell characteristics, we considered the currently acknowledged methods to calculate the immune infiltration statues among the samples from the TCGA project of the STAD dataset including XCELL, TIMER, QUANTISEQ, MCPCOUNTER, EPIC, CIBERSORT-ABS and CIBERSORT. The differences in immune infiltrating cell content explored by these methods between high-risk and low-risk groups of the constructed model were analyzed by Wilcoxon signed-rank test; the results are shown in a box chart. Spearman correlation analysis was performed to analyze the relationship between the risk score values and the immune infiltrated cells. The correlation coefficients of the results were shown in a lollipop diagram. The significance threshold was set as p <0.05. The procedure was performed using R ggplot2 packages.
Exploration of the Significance of the Model in the Clinical Treatment
To evaluate the model in the clinic for gastric cancer treatment, we calculated the IC50 of common administrating chemotherapeutic drugs in the TCGA project of the STAD dataset. Antitumor drugs such as cisplatin, docetaxel, paclitaxel, mitomycin and doxorubicin are recommended for GC treatment by AJCC guidelines. The difference in the IC50 between the high-risk and low-risk groups was compared by Wilcoxon signed-rank test and the results are shown as box drawings obtained using with pRRophetic and ggplot2 of R.
Analyses of the Immunosuppressive Molecules Expressing Related to ICIs
To study the relationship between the model and the expression level of genes related to ICIs, we performed ggstatsplot package and violin plot visualization.
Functional Enrichment Analysis
GESA and KEGG pathway enrichment analyses were performed in R using the function of clusterProfiler. The significance threshold was set at p < 0.05.