Landscape of two different TIME patterns in the TCGA cohort
A summary of the immune cell composition of the 67 MSI-H gastric cancer tissues in the TCGA cohort is shown in Supplemental Fig. 1. The five most abundant immune cell types were, in descending order, M0 macrophages, resting memory CD4+ T cells, M2 macrophages, CD8+ T cells, and M1 macrophages (Fig. 1A). NbClust provides 30 indexes for determining the optimal number of clusters. In the cluster analysis of TIME components, across the tested range of 2–15 clusters, two clusters were considered optimal, as supported by six indexes (Supplemental Fig. 2). Using the k-means method, two distinct TIME patterns were evident (Fig. 1B). The first pattern was characterized by higher proportions of resting memory CD4+ T cells, resting NK cells, M0 macrophages, activated mast cells, and neutrophils, whereas the second pattern showed notably higher proportions of CD8+ T cells, activated memory CD4+ T cells, regulatory T cells, gamma delta T cells, activated NK cells, M1 macrophages, and resting mast cells.
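As an illustration of the clustering step, the following is a minimal sketch of k-means (Lloyd's algorithm) with two clusters, written in plain Python rather than the R workflow used in the study. The sample values, the two-feature layout (CD8+ T cell fraction, M0 macrophage fraction), and the choice of starting centroids are hypothetical and serve only to show how samples are assigned to TIME patterns.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm. Returns (labels, centroids)."""
    labels = []
    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids

# Hypothetical immune cell fractions: (CD8+ T cells, M0 macrophages).
samples = [
    (0.05, 0.30), (0.04, 0.35), (0.06, 0.28),  # M0-rich, CD8-poor
    (0.20, 0.08), (0.25, 0.05), (0.22, 0.10),  # CD8-rich, M0-poor
]
# Assumed starting centroids: one sample from each apparent group.
labels, centroids = kmeans(samples, [samples[0], samples[3]])
print(labels)
```

In practice the number of clusters would be chosen beforehand, as was done here with the NbClust indexes, and the full vector of immune cell fractions would be used instead of two features.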
Functional annotation of the DEGs and subtyping of MSI-H gastric cancer
Using the limma R package, 77 DEGs were identified between the two MSI-H TIME patterns of the TCGA cohort; of these, 65 DEGs were upregulated in the second TIME pattern and 12 DEGs were upregulated in the first TIME pattern. The detailed results are shown in Supplemental Table 1. GO analysis using the clusterProfiler R package revealed that the 65 upregulated DEGs in the second pattern were mainly enriched in terms related to immune response, defense response, response to stimulus, and T cell activation, whereas the 12 upregulated DEGs in the first pattern were mainly enriched in inflammatory response and collagen catabolism terms (Fig. 1C and 1D). The NMF algorithm was applied to the 65 upregulated DEGs in the second pattern, and the optimal number of clusters was selected as the point at which the cophenetic correlation coefficient begins to fall (Supplemental Fig. 3). On this basis, the TCGA cohort was divided into two distinct subgroups: 29 tumors were classified as MSI-H subtype 1 (MSI-S1) and 38 tumors were classified as MSI-S2 (Fig. 2A). Clinical characteristics, including sex, age, tumor location, histology, Lauren type, and stage, were similar between the two MSI-H subtypes (Table 1).
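GO enrichment of a gene list is commonly assessed with a one-sided hypergeometric (over-representation) test. The sketch below shows that calculation in plain Python. It is not the clusterProfiler implementation, and apart from the 65 DEGs reported above, the background size, the term size, and the overlap are hypothetical numbers chosen for illustration.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_term, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes in the background universe
    n_term:       background genes annotated to the GO term
    n_selected:   genes in the query list (e.g. DEGs)
    n_overlap:    query genes annotated to the term
    """
    upper = min(n_term, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# 65 DEGs (from the text); the other three numbers are hypothetical.
p_value = hypergeom_enrichment_p(
    n_background=20000, n_term=400, n_selected=65, n_overlap=12
)
print(f"enrichment P = {p_value:.3g}")
```

With these illustrative numbers only about 1.3 annotated genes would be expected by chance, so an overlap of 12 yields a very small P value. A real analysis would additionally correct for testing many GO terms.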
Table 1
Clinicopathological characteristics in the TCGA, ACRG, and PUCH cohorts.
Characteristic | TCGA MSI-S1 | TCGA MSI-S2 | TCGA P value | ACRG MSI-S1 | ACRG MSI-S2 | ACRG P value | PUCH MSI-S1 | PUCH MSI-S2 | PUCH P value |
Age | | | 0.783 | | | 0.743 | | | 0.185 |
< 70 | 12 (41.4) | 17 (44.7) | | 22 (66.7) | 22 (62.9) | | 12 (70.6) | 13 (92.9) | |
≥ 70 | 17 (58.6) | 21 (55.3) | | 11 (33.3) | 13 (37.1) | | 5 (29.4) | 1 (7.1) | |
Sex | | | 0.325 | | | 0.346 | | | 0.068 |
Male | 11 (37.9) | 19 (50.0) | | 20 (60.6) | 25 (71.4) | | 9 (52.9) | 12 (85.7) | |
Female | 18 (62.1) | 19 (50.0) | | 13 (39.4) | 10 (28.6) | | 8 (47.1) | 2 (14.3) | |
Location | | | 0.876 | | | 0.909 | | | 0.733 |
Upper | 2 (6.9) | 3 (7.9) | | 2 (6.1) | 2 (5.7) | | 4 (23.5) | 3 (21.4) | |
Middle | 10 (34.5) | 10 (26.3) | | 8 (24.2) | 7 (20.0) | | 1 (5.9) | 2 (14.3) | |
Lower | 15 (51.7) | 23 (60.5) | | 23 (69.7) | 26 (74.3) | | 12 (70.6) | 9 (64.3) | |
Unknown | 2 (6.9) | 2 (5.3) | | | | | | | |
Stage 6th | | | 0.735 | | | 0.772 | | | 0.722 |
Ⅰ or Ⅱ | 13 (44.8) | 18 (47.4) | | 20 (60.6) | 20 (57.1) | | 8 (47.1) | 8 (57.1) | |
Ⅲ or Ⅳ | 13 (44.8) | 18 (47.4) | | 13 (39.4) | 15 (42.9) | | 9 (52.9) | 6 (42.9) | |
Unknown | 3 (10.3) | 2 (5.3) | | | | | | | |
Histology | | | 0.511 | | | 0.188 | | | 0.999 |
Adenocarcinoma | 9 (31.0) | 16 (42.1) | | 32 (97.0) | 30 (85.7) | | 14 (82.4) | 11 (78.6) | |
Mucinous or Signet ring cell carcinoma | 7 (24.1) | 10 (26.3) | | 0 (0) | 3 (8.6) | | 3 (17.6) | 3 (21.4) | |
Unknown | 13 (44.8) | 12 (31.6) | | 1 (3.0) | 2 (5.7) | | | | |
Lauren type | | | 0.421 | | | 0.051 | | | 0.699 |
Intestinal | 10 (34.5) | 19 (50.0) | | 20 (60.6) | 23 (65.7) | | 8 (47.1) | 8 (57.1) | |
Diffuse | 6 (20.7) | 7 (18.4) | | 8 (24.2) | 12 (34.3) | | 6 (35.3) | 3 (21.4) | |
Mixed | 13 (44.8) | 12 (31.6) | | 5 (15.2) | 0 (0) | | 3 (17.6) | 3 (21.4) | |
Values are n (%). TCGA, The Cancer Genome Atlas; ACRG, Asian Cancer Research Group; PUCH, Peking University Cancer Hospital; MSI, microsatellite instability.
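The P values in Table 1 compare categorical distributions between the two subtypes. This section does not name the test that was used, so the sketch below simply assumes a Pearson chi-square test without continuity correction on a 2 × 2 table. It is applied to the sex and age counts of the TCGA cohort taken from Table 1.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
        [[a, b],
         [c, d]]
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 d.f.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# TCGA cohort, sex: male 11 vs 19, female 18 vs 19 (MSI-S1 vs MSI-S2).
chi2_sex, p_sex = chi_square_2x2(11, 19, 18, 19)
# TCGA cohort, age: <70 years 12 vs 17, >=70 years 17 vs 21.
chi2_age, p_age = chi_square_2x2(12, 17, 17, 21)
print(f"sex: P = {p_sex:.3f}; age: P = {p_age:.3f}")
```

Under this assumption the computed values agree with the tabulated TCGA P values for sex (0.325) and age (0.783) to three decimals. Rows with small counts, as in the PUCH cohort, may have been analyzed with a different test such as Fisher's exact test.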
We also validated the DEG-based subtyping with the NMF method in the ACRG and PUCH cohorts. Analysis of the optimal number of clusters suggested two subgroups in both cohorts: among the 68 samples of the ACRG cohort, 33 tumors were classified as MSI-S1 and 35 as MSI-S2, whereas among the 31 samples of the PUCH cohort, 17 were classified as MSI-S1 and 14 as MSI-S2 (Supplemental Fig. 4–7). Interestingly, the ratio of MSI-S1 to MSI-S2 was close to 1:1 in all three cohorts.
Gene expression profile, TIME and somatic mutation characteristics of MSI-H subtypes in the TCGA cohort
The cytolytic activity score (CYT), based on the expression of granzyme A (GZMA) and perforin (PRF1), is associated with the antitumor activity of cytotoxic lymphocytes in the TIME. The CYT was calculated as described in a previous study[17] and compared between MSI-S1 and MSI-S2. The expression levels of immune checkpoints, including CD274 (PD-L1), IDO2, PDCD1LG2 (PD-L2), CTLA4, IDO1, ADORA2A (A2AR), LAG3, PDCD1 (PD1), TIGIT, HAVCR2 (TIM3), VISTA (C10orf54), and VTCN1 (B7-H4), were also compared between the two subtypes. The T cell-inflamed signature proposed by Gajewski et al.[18], which comprises IRF1, CD8A, CCL2, CCL3, CCL4, CXCL9, CXCL10, ICOS, GZMK, HLA-DMA, HLA-DMB, HLA-DOA, and HLA-DOB and is considered to predict immunotherapy response, was also evaluated. MSI-S1 was associated with high CYT (P < 0.001) and high T cell-inflamed signature expression (Fig. 2B, 2E); MSI-S1 also exhibited significantly higher expression of most of the immune checkpoint genes, such as PD-L1 (P < 0.001), PD-L2 (P < 0.001), CTLA4 (P < 0.001), and PD-1 (P < 0.001) (Fig. 2D). Comparison of the TIME components showed high levels of CD8+ T cells, activated memory CD4+ T cells, and gamma delta T cells in the MSI-S1 subtype, suggesting an active antitumor TIME, whereas high levels of M0 macrophages and activated mast cells, which may be associated with immunosuppression, were detected in the MSI-S2 subtype (all P < 0.05, Fig. 2C). Together, these analyses show that the MSI-H gastric cancer subtypes have distinct TIME patterns and suggest that they may respond differently to immune checkpoint inhibitors.
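The CYT is commonly defined as the geometric mean of GZMA and PRF1 expression, with a small offset added to avoid zeros. The sketch below assumes that reference 17 follows this definition. The offset of 0.01 and the expression values are illustrative assumptions and are not taken from the study data.

```python
import math

def cytolytic_score(gzma, prf1, offset=0.01):
    """Geometric mean of GZMA and PRF1 expression (e.g. in TPM).

    The offset (assumed here to be 0.01) keeps the score defined when
    one of the two genes has zero measured expression.
    """
    return math.sqrt((gzma + offset) * (prf1 + offset))

# Hypothetical expression values for two tumors.
tumor_high = cytolytic_score(gzma=120.0, prf1=85.0)
tumor_low = cytolytic_score(gzma=8.0, prf1=5.0)
print(f"CYT high = {tumor_high:.1f}, CYT low = {tumor_low:.1f}")
```

Because the score is a geometric mean, a tumor needs appreciable expression of both genes to score highly; strong expression of only one gene is not sufficient.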
Analysis of the somatic mutation data identified the 20 most frequently mutated genes in the two subgroups (Fig. 3A). We identified 130 mutated genes whose mutation frequencies differed significantly between the two subtypes (Supplemental Table 2); the 13 genes with P < 0.01 are shown in Fig. 3B. Previous studies have reported associations of ABCB4 and PIK3R2 with drug resistance in gastric cancer[19, 20]; these data may help to explore how such genes affect the TIME in future work. In addition, some studies have suggested that TMB and MATH reflect tumor heterogeneity and are associated with immunogenic antigen density. However, the levels of TMB and MATH did not differ significantly between the two subgroups (P = 0.501 and 0.621, respectively) (Fig. 3C and 3D). This result suggests that the suppression of the antitumor TIME in MSI-H gastric cancer may not result from a lack of antigens. Furthermore, we compared the level of fibroblasts and the stromal score between the two subgroups and found no significant differences (P = 0.105 and 0.056, respectively) (Fig. 3E and 3F). In other words, other molecular pathway mechanisms may affect the composition of the TIME in MSI-H gastric cancer.
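For orientation, TMB and MATH can be computed from a list of somatic mutations as sketched below. MATH is usually defined as 100 × MAD / median of the variant allele frequencies (VAFs), with the MAD scaled by 1.4826, and TMB as mutations per megabase of covered sequence. This section does not state how the two measures were computed in the study, so the definitions, the assumed exome size of 38 Mb, and the VAF values are all illustrative.

```python
from statistics import median

def math_score(vafs):
    """Mutant-allele tumor heterogeneity: 100 * MAD / median of VAFs.

    MAD is the median absolute deviation scaled by 1.4826.
    """
    center = median(vafs)
    mad = 1.4826 * median(abs(v - center) for v in vafs)
    return 100 * mad / center

def tmb_per_mb(n_mutations, covered_mb=38.0):
    """Mutations per megabase; 38 Mb is an assumed exome size."""
    return n_mutations / covered_mb

# Hypothetical variant allele frequencies for one tumor.
vafs = [0.1, 0.2, 0.3, 0.4, 0.5]
score = math_score(vafs)
burden = tmb_per_mb(n_mutations=380)
print(f"MATH = {score:.2f}, TMB = {burden:.1f} mutations/Mb")
```

A broader spread of VAFs around their median gives a higher MATH score, which is why the measure is read as an indicator of intratumor heterogeneity.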
Validation of the gene expression profile and TIME characteristics in the ACRG and PUCH cohorts
To further validate the generalizability of the DEG-based subtypes, we performed the corresponding analyses in the ACRG and PUCH cohorts. In the ACRG cohort, several immune checkpoint genes were expressed at significantly higher levels in the MSI-S1 subgroup, including PD-L1 (P < 0.001), PD-L2 (P < 0.001), and CTLA-4 (P < 0.001) (Fig. 4A). Moreover, TIME component analysis revealed significantly higher proportions of CD8+ T cells, activated memory CD4+ T cells, and M1 macrophages in the MSI-S1 subgroup and higher proportions of resting memory CD4+ T cells and M0 macrophages in the MSI-S2 subgroup (all P < 0.05, Fig. 4B). In the PUCH cohort, we observed a similar expression profile, with high levels of PD-L1 (P < 0.001), PD-L2 (P < 0.001), and CTLA-4 (P < 0.001) along with a high proportion of CD8+ T cells (P = 0.003) in the MSI-S1 subgroup (Fig. 4C, 4D). In addition, the T cell-inflamed signature was highly expressed in both the ACRG and PUCH cohorts (Fig. 4E, 4F). Notably, the expression of immune-related genes and the TIME component characteristics were consistent across all three cohorts.
GSEA and survival analysis
GSEA was performed on gene expression data from the MSI-S1 and MSI-S2 subgroups; the top 20 enriched pathways in the MSI-S2 subtype of each of the three cohorts are shown in Supplemental Table 3. Among these top 20 pathways, we identified those shared by the three cohorts: the bile acid metabolism, downregulation of K-ras signaling, and WNT/β-catenin pathways were enriched in MSI-S2 in all three cohorts (Fig. 5).
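Identifying the pathways shared by the three cohorts amounts to intersecting the per-cohort lists of top-ranked pathways. The sketch below shows that step with shortened lists. The three shared pathway names come from the text, whereas the entries labelled "placeholder" are hypothetical stand-ins for the remaining pathways in Supplemental Table 3.

```python
# Top enriched pathways per cohort (truncated; placeholder entries are hypothetical).
top_pathways = {
    "TCGA": {
        "bile acid metabolism",
        "downregulation of K-ras signaling",
        "WNT/beta-catenin",
        "placeholder pathway A",
    },
    "ACRG": {
        "bile acid metabolism",
        "downregulation of K-ras signaling",
        "WNT/beta-catenin",
        "placeholder pathway B",
    },
    "PUCH": {
        "bile acid metabolism",
        "downregulation of K-ras signaling",
        "WNT/beta-catenin",
        "placeholder pathway C",
    },
}

# Pathways present in the top list of every cohort.
common = set.intersection(*top_pathways.values())
for name in sorted(common):
    print(name)
```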
The Kaplan-Meier method revealed no significant difference in patient survival between the MSI-S1 and MSI-S2 subgroups in the ACRG and PUCH cohorts (Fig. 6A, 6D). However, in a further analysis stratified by receipt of adjuvant chemotherapy, patients in the MSI-S1 subgroups benefited from adjuvant chemotherapy (P = 0.043 and 0.050 in the ACRG and PUCH cohorts, respectively) (Fig. 6B, 6E), whereas survival in the MSI-S2 subgroups was not affected by adjuvant chemotherapy (Fig. 6C, 6F).
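The survival curves referred to above are Kaplan-Meier estimates. A minimal sketch of the estimator in plain Python is given below. The follow-up times and event indicators are hypothetical, and the log-rank comparison that yields the reported P values is not reproduced.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times:  follow-up time for each patient
    events: 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after t.
        n_at_risk -= removed
    return curve

# Hypothetical follow-up (months) and event indicators for five patients.
curve = kaplan_meier(times=[5, 8, 12, 12, 20], events=[1, 0, 1, 1, 0])
for t, s in curve:
    print(f"t = {t:>2}: S(t) = {s:.3f}")
```

Censored patients contribute to the risk set until their last follow-up but do not lower the curve, which is what distinguishes this estimate from a simple proportion of survivors. In a stratified analysis such as the one by adjuvant chemotherapy, the estimator is applied separately within each stratum.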