Clinicopathologic Characteristics of selected patients
Among the 286 patients passed initial filtering, 51 (17.8%) of them were LEP-predominant, 178 (62.2%) were ACI-predominant, 16 (5.6%) were PAP-predominant, 29 (10.1%) were MIP-predominant, and 12 (4.2%) were with SOL-predominant adenocarcinoma. The clinicopathologic characteristics of all patients were summarized in Supplementary Table 1. As shown in Supplementary Table 2, the 5-year DFS rates were 82.9%, 67.1%, 68.6%, 57.7% and 50.0% while the 5-year OS rates were 92.2%, 66.3%, 75.0%, 25.0% and 48.3% respectively for LEP-, ACI-, PAP-, MIP-, and SOL-predominant patients. Patients with MIP-predominant subtype exhibited a significantly worse DFS than those with other adenocarcinoma subtypes (P<0.05, Supplementary Figure 1A) and LEP and MIP subtypes exhibited largest DFS difference while SOL-predominant patients showed significantly worse OS (P<0.05, Supplementary Figure 1B).
Mutational landscape exhibits the involvement of distinct biological processes in LEP and MIP lesions
Among 8 micro-dissected samples, 6 cases passed quality control, but the quantity of LEP component in one case was not enough. 6 MIP and 5 LEP components were finally sequenced (Supplementary Table 3 and Figure 1A). A total of 2035 and 2757 SNVs and InDels were identified while 684/791 and 257/284 mutations were retained after quality and cancer-related gene filtrations (Supplementary Figure 2A and data quality metrics in Supplementary Table 4). Genes with top mutation frequency after quality filtering was shown in Figure 1B. EGFR was identified to be the most frequently mutated drive gene, in line with the finding that LEP and MIP components possess significantly higher EGFR mutation frequency(31). Besides, several cancer-associated genes including TP53, TRIO, CEBPA, PCLO and PDE4DIP were concomitantly mutated (Figure 1B), denoting p53, WNT-beta-catenin signaling, PI3K/AKT/mTOR signaling and DNA repair pathways were affected. Interestingly, shared mutations were observed between paired LEP and MIP components from single patients (Figure 1B), raising the presumption that the paired LEP and MIP components could be homogenious. We also compared the mutation frequency of these genes with public cohorts, including Lung-Broad, Lung-OncoSG and TCGA-LUAD. Several cancer-associated genes including EGFR, TP53, TRIO, PCLO and PDE4DIP were found recurrently mutated (Supplementary Figure 2B).
Additionally, mutation signature analysis was conducted on unfiltered mutations separately for LEP and MIP components. The point substitution spectrum plot displayed insignificant difference between the two histological subtypes (Supplementary Figure 2C). Similarly, the SBS, InDel and DBS signatures mapped to COSMIC database (accessed in March 2021) were similar between the two subtypes (Figure 1C-D), denoting the histological differences between LEP and MIP components could be caused by alternations on specific key genes.
Of particular interest, mutated tumor suppressor genes (TSGs) were enriched in distinct pathways in LEP and MIP subtype (Figure 1E). TSGs from LEP components were enriched in DNA repair and TP53-related pathways while mutated TSGs in MIP components were found enriched in pathways associated with beta-catenin destruction complex, AXIN mutation and WNT signaling, which shared high consistency with previous report(4). When concerning the enriched pathways for mutated oncogenes, 7 out of 10 top enriched Reactome pathways from two groups were identical, which were mainly associated with EGFR and PI3K signaling (Supplementary Figure 2D). By further inspecting the mutated TSG pathway enrichment pattern in Lung-Broad (Supplementary Figure 3A-B), Lung-OncoSG (Supplementary Figure 3C-D) and TCGA-LUAD (Supplementary Figure 3E-F) cohorts, similar entries were identified in both LEP and MIP samples while NOTCH1-related pathways were addedly found in TCGA-LUAD MIP samples, which was unsurprising since the cross-talk between NOTCH and WNT pathways was previously unveiled(32). As for the oncogenes mutated in three public cohorts, shared terms were found between LEP and MIP components in Lung-Broad (Supplementary Figure 4A-B), Lung-OncoSG (Supplementary Figure 4C-D) as well as TCGA-LUAD (Supplementary Figure 4E-F) cohort but with lower overlapping proportion, again endorsed the possible existence of common mutational ancestor in the paired components.
Copy number alternation and clonality analysis uncovered distinct ITH characteristics in LEP and MIP subtypes
Through the segment level copy number alternation identification procedures, multiple amplified and deleted segments were detected (Supplementary Figure 5A). Chromosomes including 3,4,5,10,15,17 and 18 exhibited different copy number alternation patterns between two subtypes and the MIP subtype showed both higher chromosome level copy number variation (CNV) burden (Supplementary Figure 5B) and arm level CNV burden (Supplementary Figure 5C), which was consistent with previous report(33). To further pinpoint the recurrent SCNAs at focal level, we identified 1116 genes with somatic copy number alternations through statistical testing on read coverages from all samples, among which 159/80 genes were uniquely amplified/deleted in LEP component while 34/11 genes were uniquely amplified/deleted in MIP component. By further annotating the enriched pathways on these genes, 27 pathways were found overlapping between enrichment results of uniquely amplified genes in LEP and deleted genes in MIP, which could further be categorized into immune system, innate immune system, interleukin signaling, SHC1 events, ERK activation and FRS-mediated signaling pathways (Figure 2A). When further inspecting the number of genes, pathways related to immune system and innate immune system got the highest gene number variated (37 genes amplified in LEP and 4 genes for deleted in MIP subtype), indicating MIP LUADs tend to have induced immunosurveillance escape. Additionally, two pathways were identical between the enrichment results of uniquely deleted genes in LEP and amplified genes in MIP (Figure 2B), which were associated with Homology Directed Repair (HDR) and mRNA fate regulation but the variated gene number was limited (6 genes for LEP- and 4 genes for MIP group).
The intratumor heterogeneity (ITH) can depict the genetic and epigenetic tumor inner diversity and was proven to be closely related to cancer progression, therapeutic resistance and recurrences. To compare the ITH of the two histological subtypes at both mutational and copy number level, we annotated the mutations/SCNAs with clonality. As shown in Figure 2C, MIP group has higher clonal tumor mutation burden (cTMB) and lower subclonal mutation proportion (Figure 2D), denoting the lower mutation level ITH of the MIP subtype. The MATH score, which was widely used to measure the mutational ITH, exhibited similar trend (Supplementary Figure 5D). As for the copy number variations, MIP group possessed significantly higher proportion of subclonal SCNAs (Figure 2E) as well as subclonal genome fraction (Figure 2F). Interestingly, the frequency of clonal mutations in DNA Damage Response (DDR) and WNT pathway genes was higher in MIP subtype (Supplementary Figure 5E), which possibly partial give rise to the subclonal genome alternations on immune-related genes since the association between canonical WNT-beta-catenin signaling and carcinogenesis as well as immune suppression was clear(34). Indeed, 6 MIP components got higher percentage of subclonal SCNA (Figure 2G) as well as higher number of focal deletions (Figure 2H) on the genes related to the two immune pathways.
Evolutionary pattern exploration on the paired LEP and MIP components
To elaborate the possible evolutionary process between LEP and MIP subtypes, we delineated the phylogenetic trees for each patient based on mutations as well as focal level SCNAs. As shown in Figure 3, all 5 patients possessed truncal mutations between paired LEP and MIP components while no obvious bias on private mutation burden after truncal divergence was observed. Clonal mutations on cancer drivers including EGFR, TP53 and CEBPA were identified and EGFR was the only gene coincident in 5 pairs, which confirmed the presence of ancestral mutations. Additionally, the driver mutations private to LEP were enriched in chromatin organization, TP53-related and DNA double strand repair pathways (Supplementary Figure 6A) while mutations private to MIP were enriched in cellular signaling and beta-catenin-related pathways (Supplementary Figure 6B). We further annotated the shared mutations in Figure 3 with clonality to explore the clonal-subclonal transitions between LEP and MIP subtype. For the genes possessing mutations with increased clonality in MIP, GO terms related to neurogenesis were found enriched (Supplementary Figure 6C), denoting the tumor-induced neurogenesis and nerve-cancer crosstalk may account for the aggressiveness of MIP subtype. Oppositely, genes with mutations switched from subclonal to clonal in LEP were associated with cell-cycle related GO biological processes (Supplementary Figure 6D). As for the truncal focal SCNAs, several driver genes including CSMD3, SPTAN1, BCORL1, CAMTA1, GRIN2A, MED12 and TRAF7 were concurrently amplified in the two subtypes (Figure 3), which was associated with developmental biology (R-HSA-1266738) and EGFR-related Reactome pathways (R-HSA-179812 and R-HSA-180336). Moreover, the deletion of TP53, MUC4, ARID5B, ANK1, PTEN, SFPQ, FANCA, MAF and ZFHX3 were observed in the two subtypes. Interestingly, no driver gene showed concordant copy number variation in 5 pairs of samples, possibly due to the elevated SCNA-level ITH in MIP group. We also scrutinized the genes with shared copy number variation in the paired samples. As shown in Supplementary Figure 7A, most shared deletions were on immune-related genes, while signal transduction and PI3K/AKT pathways, which abnormality is highly associated with tumor progression and therapeutic resistance, were found uniformly amplified (Supplementary Figure 7B). To further derive the mutational transitions and evolutionary trajectory, we used PyClone-VI to infer the mutational populations and their evolution from paired components. As shown in Supplementary Figure 8A, numerous clone clusters were identified in 5 patients, which exhibited dynamic variant allele frequency (VAF) alternation. Clusters with drastically increased VAF in LEP subtype were mainly enriched in mRNA splicing pathways (Supplementary Figure 8B) while clusters with increased VAF in MIP subtype were associated with ERBB2 functions (Supplementary Figure 8C). These data imply that LEP and MIP components from one patient were derived from same initiation cells and the pathway-specific mutations acquired after EGFR clonal mutation eventually shaped the subtype-specificity.
Group-wise comparison discovered possible novel therapeutic targets for MIP histological subtype
For the sake of the derivation of histological subtype-specific therapeutic targets, we next gathered SNV and SCNA data from the LEP/MIP (both LEP and MIP) groups and identified the genes with highest alternation frequency difference in all samples. As shown in Figure 4A, large mutation frequency difference was observed on 9 genes, with 3 genes specifically mutated in LEP group. Additionally, 5 genes were found with distinct copy number alternation pattern (Figure 4B), with 1 gene specifically amplified in LEP group. Similarly, mutation frequency on the 9 genes plus EGFR were compared between 4 public cohorts (Figure 4C) and EGFR was the only gene with significant mutational difference in non-east Asian public cohorts. Moreover, all the 5 SCNA group-specific gene showed alternation frequency difference relatively in 3 non-east Asian cohorts (Figure 4D). Interestingly, the altered sample proportion or the alternation frequency for the MIP-group-amplified gene PTP4A3, NAPRT and RECQL4 was highly similar in our samples (Figure 4B) and public cohorts (Lung-Broad, Lung-OncoSG and TCGA-LUAD, Figure 4D), implying the feasibility of their cooperative function through duplication in MIP component. Spearman’s correlation coefficient (SCC) on SCNA pattern of our 11 samples confirmed the association of the co-amplified genes (Supplementary Figure 9A). Such strong association was also observed on LEP and MIP SCNA data from Lung-Broad (Supplementary Figure 9B), Lung-OncoSG (Supplementary Figure 9C, left) and TCGA-LUAD (Supplementary Figure 9D, left) cohorts. Concerning the fact that SCNA is highly related to the consequent gene expression alternation, we calculated the expressional SCC between the 5 group-specific genes on cohorts with available transcriptomic data. Unsurprisingly, when compared to all samples (Supplementary Figure 9C-D, middle), the expressional associations between PTP4A3, NAPRT and RECQL4 transformed to a higher synergetic state for LEP/MIP samples in both Lung-OncoSG (Supplementary Figure 9C, right) and TCGA-LUAD (Supplementary Figure 9D, right) cohort. More explicitly, the correlation between SCNA and RNA expression was higher for the three genes in two public cohorts (Figure 4E) and NAPRT as well as PTP4A3 exhibited significantly higher LEP/MIP group-specificity. As an exemplification, the correlation between PTP4A3 gene’s SCNA and RNA expression increased from 0.372 to 0.693 when narrowing to only LEP/MIP samples in TCGA-LUAD cohort (Supplementary Figure 9E-F). Complementarily, all three co-amplified genes were significantly overexpressed in tumor samples (Supplementary Figure 9G-I, left) and possessed higher expression levels in MIP subtype (Supplementary Figure 9G-I, right) in TCGA-LUAD cohort. Patients with elevated expressional level of the 3 co-amplified genes unsurprisingly got worse overall survival (OS) in Lung-OncoSG (Supplementary Figure 10A) and TCGA-LUAD (Supplementary Figure 10C) cohorts and the discrepancy exacerbated in LEP/MIP samples relatively in two cohorts (Supplementary Figure 10B,D). Complementally, patients with increased SCNA level on the co-amplified genes exhibited similar trend as the expressional stratification strategy, both for all LUAD samples (Supplementary Figure 10E) and LEP/MIP subset (Supplementary Figure 10F) in TCGA-LUAD cohort. To conclude, our comprehensive analyses identified PTP4A3, NAPRT and RECQL4 as the possible novel therapeutic and diagnostic targets for MIP histological subtype, which were co-amplified and co-expressed specifically in LEP/MIP components and expressionally upregulated in MIP group.
Immune-related analyses uncovered elevated immunosuppression in MIP subtype
In connection with the 3 co-amplified genes and the possible exacerbated immunosuppression in MIP subtype, we quantified the immune infiltration in the tumor microenvironment (TME) of three public LUAD cohorts to investigate the possible roles of these genes. For Lung-OncoSG cohort, some differences were observed but the discrepancy did not reach statistical significance (Supplementary Figure 11A). We next correlated the SCNA level of 3 co-amplified genes with the immune cell infiltration levels. By only selecting correlations with P-value<0.01, entries including “Granulocyte-monocyte progenitor”, “immune score” and “microenvironment score” were found negatively correlated (Supplementary Figure 11B), all indicated a reduced infiltration of general immune and stromal cells in MIP subtype. Similarly, when selecting correlations between RNA expression and immune infiltrations with P-value<0.01, PTP4A3 was negatively correlated with CD4+ T memory cell infiltration while NAPRT expression was negatively associated with cancer-associated fibroblast (CAF) amount (Supplementary Figure 11C-D). As for the GEO dataset, significant between-group differences were observed on 7 cell types (Supplementary Figure 12A). Levels of hematopoietic stem cell (HSC) was elevated in relatively aggressive subtype (MIP/SOL) with adjusted P-value<0.01 while quantities of myeloid dendritic cell activated and mast cell decreased in aggressive subtypes. Naïve B cell, neutrophil and T cell CD4+ T-helper 1 cell infiltration were further found associated with NAPRT and RECQL4 expressions (Supplementary Figure 12B). Similar with Lung-OncoSG cohort, the dissimilarities between LEP and MIP subtype were not statistically significant in TCGA-LUAD cohort (Supplementary Figure 13A). Similar cell types were also observed in the correlational analysis on SCNA-level (Supplementary Figure 13B) and RNA-level (Supplementary Figure 13C) data.
Moreover, inspection on the disparity of cancer immunity cycle activity was conducted on Lung-OncoSG, GEO and TCGA-LUAD cohorts. Activities of recurrent steps including release of cancer cell antigen, CD8+ T cell recruiting, dendritic cell recruiting, macrophage recruiting, T-helper 17 (Th17) cell recruiting, T cell infiltration into tumors and killing of cancer cells were significantly higher in MIP subtype (Supplementary Figure 14A-C). Interestingly, the expressions of PD-1 and PD-L1 were generally higher in MIP group in Lung-OncoSG (Supplementary Figure 15A, D), GEO (Supplementary Figure 15B, E) and TCGA-LUAD (Supplementary Figure 15C, F) datasets. By further examining the differential expressed proteins between LEP and MIP subtype from TCGA-LUAD, the identified proteins with MIP-specific elevation (Supplementary Figure 15G) were significantly enriched in PD-L1 and PD-1 checkpoint pathway in cancer (Supplementary Figure 15H) and the protein-level expression of PD-L1 was positively correlated with the transcriptional expression sum of co-amplified genes PTP4A3, NAPRT and RECQL4 (Supplementary Figure 15I). Interestingly, the known associations between the co-amplified genes and chemotherapy resistance was previously reported(35). To summarize briefly, our analyses confirmed the immunosuppression prevalence in MIP subtype and provided therapeutic suggestions for MIP-predominant LUAD patients. Shaped by the higher T cell infiltration level, possible chemotherapy resistance and immunosuppression, immune checkpoint inhibitor treatment could maximize the therapeutic benefit for MIP-predominant LUAD patients.