1. Global cellular landscape in multiple primary lung cancer.
To elucidate the cellular development in multiple primary lung cancer (MPLC), we analyzed a total of 167397 cells from 23 sampling sites in 6 patients with multiple foci of early-stage lung cancer from the National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital (Fig. 1a-b, Supplementary Fig. 1a, d and Supplementary Table 2). Among these cells, 62714 (23.2%) were from AIS/MIA, 65844 (37.5%) were from IAC, and 38839 (39.3%) were from Normal Lung. To increase the accuracy of cell type designation, we jointly applied principal component analysis (PCA) after correction for read depth and mitochondrial read counts. Through marker-based annotation, 8 major cell types were identified in these unbatched and comparable datasets by graph-based UMAP (Fig. 1b, Supplementary Fig. 1a, 1d): epithelial cells, alveolar cells, fibroblasts, endothelial cells, myeloid cells, T lymphocytes cells, B lymphocytes cells and mast cells. Global cell type annotations for MPLC patients were categorized according to the molecular subtype of lung tumor tissues. The marker-based annotation correlated well with the cell types obtained via Cell Marker and SingleR v.1.0 20,21 (Fig. 1b and Supplementary Fig. 1c). T cells and epithelial cells show great differences over the course of development of MPLC. The relative abundance of B cells increased, while that of T cells decreased, from Normal Lung to AIS/MIA and IAC (Fig. 1d, Supplementary Fig. 1b, and Supplementary 1c). The most frequently mutated genes were EGFR (53%) and RBM10 (35%) (Fig. 1e and Supplementary Fig. 1g), and the most frequent variant type was missense mutation (Supplementary Fig. 1g). C > T and C > A transversions were enriched in these patients (Supplementary Fig. 1f and Supplementary Fig. 1g). The correlation of the variant allele frequency (VAF) for each individual focus was not significant (Fig. 1f). The significant mutations and the correlation of the VAF show that the foci in MPLC are independent and do not share an origin.
2. Signatures and metabolic disturbances of malignant cells in MPLC.
A total of 13176 normal epithelial cells and 14553 malignant epithelial cells were obtained from Normal Lung, AIS/MIA and IAC samples, and large-scale copy number variations (CNVs) were inferred with Normal Lung cells as references22–24. Malignant epithelial cells were identified by inferring multipatient copy number variations (CNVs) with normal epithelial cells (mainly immune cells, including T cells and B cells) as references. The CNV patterns showed that malignant cells formed clusters in patients with different origins, indicating a high degree of intertumoral heterogeneity (Fig. 2a, Supplementary Fig. 2c, 2e). The clusters of malignant cells from different foci were independent and rarely overlapped, while the clusters of normal cells were mixed (Fig. 2a, Supplementary Fig. 2a, 2c). Alveolar cells are epithelial cells in the lung and play a pivotal role in maintaining the integrity and function of the alveoli. Both alveolar type I cells and alveolar type II cells were positive for EPCAM gene expression 25,26. Other EPCAM + cells, including club, ciliated, basal, goblet, mucous, serous, ionocyte, and neuroendocrine cells, were categorized as epithelial cells. We identified them according to the expression of canonical gene markers (Fig. 2b). We clustered the normal epithelial cells as alveolar type I cells (AT1; AGER+), alveolar type II cells (AT2; SFTPA1+), ciliated airway epithelial cells (ciliated; TPPP3+), secretory club cells (club; SCGB1A1+) and other cells (no canonical markers to classify) based on the previously described markers (Fig. 2a, Supplementary Fig. 2a, and Supplementary Fig. 2d) 17,27, independent of patient origin. The relative abundance of malignant epithelial cells, i.e., Ciliated, Club and AT1 cells, was increased in malignant cells compared with Normal Lung cells, and the number of AT2 cells was decreased from Normal Lung to AIS/MIA and IAC. We used lung cancer cells to recapitulate the multilineage differentiation processes of normal epithelia, and subclustering analysis of all epithelial cell datasets revealed divergent differentiation states in MPLC (Fig. 2d). Intriguingly, the projection of tissue cells and pathological feature classification onto the trajectory assigned the upper right branch and the lower right branch as malignant cells, mainly from IAC tissue. The trajectory indicated that there are two states in the evolution of malignant cells from AIS/MIA to IACs (Fig. 2c). MSI showed slight changes along the total pseudotime trajectory in MPLC, but TMB did not. In the pseudotime model, this trajectory began from nonmalignant AIS/MIA cells. Then, the trail branched in two directions: some malignant cells became state 1, and some malignant cells became state 2 in IAC. Branch 1 exclusively expressed immune response markers (e.g., CEACAM6, CXCL14, HLA_DPB1, CD74, S100A14, SPINK5, etc.), while branch 2 expressed cell death markers (e.g., PDCD4, AKR1C1, IER3, RGCC, APOH, etc.) (Supplementary Fig. 2g). Functional gene set enrichment analysis showed that the transition from State 1 was involved in the immune response, immune system process and response to transforming growth factor (Fig. 2e). In addition, the expression of genes implicated in cell death, response to stress and cell cycle increased in State 2 (Fig. 2f), which shows that some malignant cells will respond to stress from the immune response, some will undergo programmed cell death, and some will enter the cell cycle due to selective pressure during the development of tumors in MPLC.
3. Clonal dynamics and phenotypic transitions of tumor-infiltrating T cells in MPLC.
A total of 81,971 T cells were obtained from all samples and clustered into 10 subtypes. Three subtypes of CD4 + T cells were further clustered as Treg cells (TIGIT+, ENTPD1+, FOXP3+), Trm cells (CXCR6+) and CD4 + naïve T cells (CCR7+, IL7R+). Three subtypes of CD8 + T cells were further clustered as activated T cells/exhausted T cells (TNF+, IFNG+, GZMA+, GZMB+, KLRD1+), activated/effector T cells (TNF+, GZMA+, EOMES+), and effector T/Trm cells (ITGAE+, CXCR6+, GZMA+, GZMB+). One subtype of both CD4- and CD8-positive T cells was further clustered as naïve/memory T cells (IL7R+), and 2 subtypes with no CD4 or CD8 expression were further clustered as NK cells/exhausted T cells (HAVCR2+, IFNG+, GZMB+) and cytotoxic T cells/NK cells (XCL1+, XCL2+) (Fig. 3a, Supplementary Fig. 3a) based on canonical markers (Fig. 3b,3c, Supplementary Fig. 3b), as described previously 28–32. The relative abundances of classified naïve T cells, Treg cells and Trm cells were increased, while those of activated/exhausted T cells were decreased, from Normal Lung to AIS/MIA and IAC (Fig. 3d). We next analyzed the lineage relationships between phenotypes and clonotypes in T cells. Globally, we found that cells in normal and IAC samples shared more common clonotypes in TCRs than AIS/MIA-group cells (Fig. 3e, Supplementary Fig. 4a). To investigate whether the development of tumors within clones could inform lineage transitions between T-cell phenotypes, we aggregated all clonotypes and measured the fraction shared in different clusters. There was minimal significant clonotype sharing between T cells in different cell groups. Trm and naïve T cells shared more significant clonotypes with other T cells, and the shared clonotypes increased with the development of infiltration in the tumor tissue, suggesting an increasing transition from Trm and naïve T cells to other activated T-cell states (Supplementary Fig. 4a). The T-cell phenotypes in AIS/MIA samples include more clonotype 1, for which TRB-TRA is “CASSGLAAKPGELF-CAVRRGQNFV”, than other normal and IAC samples (Supplementary Fig. 4b). This sequence may play an important role in antigen presentation during tumor invasion. In normal tissue, there is not enough antigen from tumors to stimulate T cells, and the expression of this antigen presentation factor is low. In IAC tissue, the tumor has experienced stress from T cells in AIS/MIA, and new subclone evolution may not cause the same antigen stimulation. Therefore, clonotype 1, in which TRB-TRA is “CASSGLAAKPGELF-CAVRRGQNFV”, was expressed significantly only in AIS/MIA samples. The TCR clonotype shared more overlap in focus grouped according to the source of the patient compared to those grouped by anatomical pathology info and still had more correlation (Fig. 3f).
4. Trajectory profiling reveals the branched progression of T cells in MPLC.
We next used Monocle3 to analyze the CD8 + and CD4 + T-cell trajectories to further investigate the potential transitions in each of these cell types. The pseudotime trajectory axis derived from Monocle3 indicates that CD8 + naïve and memory T cells transdifferentiate into effector T cells first. Then, some of these cells become exhausted T cells, and others transform into cytotoxic T cells (Fig. 4a). We compared the expression of genes in the next-stage cells with those in the previous stage (Supplementary Fig. 3c, 3d). Cells on the memory T to effector T branch expressed immune response markers (e.g., TNFRSF14, NFE2L2, TNFRSF1B, IL2RG, etc.) (Supplementary Fig. 3c,3d), while naïve T cells transitioning to effector T cells expressed lymphocyte activation and T-cell differentiation markers (e.g., FOXP1, TGFBR2, TNFRSF4, PRDM1, etc.) (Supplementary Fig. 3c, 3d). Cells on the effector T to exhausted T branch expressed responses to stress and cell death markers (e.g., GNLY, CCL4, and FCGR3A) (Supplementary Fig. 3c, 3d), while effector T cells expressed cytokine production markers (e.g., CXCR4, CRTAM, etc.) (Supplementary Fig. 3c, 3d). Gene Ontology enrichment analysis confirmed that the transition from memory T cells to effector T cells was mainly implicated in the immune response, immune system process and immune response regulation. The transformation from memory to effector in T cells is mainly a process of immune response and regulation (Fig. 4b (left-upper)). The function of the geneset in the transformation from naïve T to effector T cells was mainly related to lymphocyte activation, T-cell and leukocyte differentiation, T-cell activation and immune system development, which showed that the transformation from naïve to effector T cells was mainly a process of leukocyte and T-cell differentiation (Fig. 4b (upper right)). Gene Ontology enrichment analysis confirmed that signature genes were enriched for response to stress, positive regulation of cell death and response to stimulus pathway during effector T-cell transformation to exhausted T cells (Fig. 4b (lower right)). The functional gene set enrichment analysis from effector to cytotoxic T cells showed that the transition was mainly implicated in the regulation of cytokine production, which included the positive regulation of myeloid leukocyte cytokine production involved in the immune response, immune effector processes, macrophage cytokine production and the positive regulation of cytokine production involved in the immune response (Fig. 4b (lower left)). The pseudotime trajectory axis indicates that CD4 + naïve and Treg cells have four final differentiation states in themselves, which are labeled as Naïve_C1 to Naïve_C4 and Treg_C1 to Treg_C4 (Fig. 4c, Supplementary Fig. 5a). Pathway analyses showed that Naive_C2 cells were enriched in some effector pathways [Hallmark pi3k_akt_mtor signaling, chemokine signaling pathway, and natural killer cell-mediated cytotoxicity], while Naive_C4 cells did not express some pathways [T-cell receptor signaling pathway, regulation of lymphocyte differentiation, regulation of T-cell differentiation, T-cell differentiation, T-cell proliferation, and T-cell activation]. The function of the T cells in the Naïve_C2 cluster is closer to that of effector or exhausted T cells, while the function of the Naïve_C4 cluster is more closely related to that of other naïve clusters (supplemental Fig. 5b (Left)). The GSVA results showed that Treg_C3 cells were enriched in the regulation of myeloid cell differentiation [myeloid leukocyte differentiation, myeloid cell differentiation, and regulation of myeloid cell differentiation], while Treg_C2 cells mainly participated in the regulation of cell differentiation, activation of the T-cell receptor signaling pathway [regulation of cell differentiation, T-cell receptor signaling pathway, T-cell receptor signaling pathway and T-cell activation]. Treg_C2 cells also had high expression levels of genes related to lymphocyte differentiation, including lymphocyte differentiation, regulation of T-cell differentiation, and regulation of lymphocyte differentiation. Treg_C1 cells mainly participates in the regulation of stem cell differentiation [regulation of stem cell differentiation, stem cell differentiation, and hematopoietic progenitor cell differentiation] (Supplementary Fig. 5b (Right)).
5. B cells have two functional distribution subtypes in MPLC.
A total of 4279 B cells were analyzed and classified into five subtypes (Fig. 5a, Supplementary Fig. 6a, b, supplementary Table 3): Bcells_C1 (MS4A1+, CD19+, CD74+, CD37+, BTG+) (49.7% in IAC, 10.3% in AIS/MIA, 6.4% in Normal); Bcells_C2 (NEIL1+, RGS13+) (0.9% in IAC, 0.4 in AIS/MIA, 0.0% in Normal); Bcells_C3 (CYTOR+, NUCB2+, ANKRD28+, TRIB1+) (7.8% in IAC, 2.9% in AIS/MIA, 1.9% in Normal); Bcells_C4 (6.2% in IAC, 3.8% in AIS/MIA, 0.9% in Normal); and Bcells_C5 (IL7R+, IL32+, CST7+, FYB1+) (5.9% in IAC, 1.5% in AIS/MIA, 1.4% in Normal) (Supplementary Fig. 6e, Supplementary Fig. 7a). The relative abundance of the B-cell subtype showed no uniform pattern in different patient foci and or when grouped by anatomical pathology (Fig. 5c, Supplementary Fig. 6c, d). The pathway analysis of B cells in MPLC further confirmed that there are two kinds of B cells with different functions. The regulation and activation of the innate immune response was highly activated in the Bcell-C4 and Bcell-C5 clusters (Fig. 5d, red pathway). However, the Bcell-C1, Bcell-C2, and Bcell-C3 clusters were enriched in pathways related to antigen binding, processing and presentation, including Reactome MHC class II Antigen Presentation (Fig. 5d, green pathway). Some B cells are involved in immune activation and regulation of the immune response, while others are involved in antigen presentation. We next used Monocle3 to analyze the trajectory performance of B cells to further investigate the potential transition in each of their cell types. The pseudotime trajectory axis derived from Monocle3 indicates that Bcells_C1 (MS4A1+, CD19+, CD74+, CD37+, BTG+) have two differentiation states; one is transdifferentiating into Bcell_C3 (NEIL1+, RGS13+) and Bcell_C5 (IL7R+, IL32+, CST7+, FYB1+), and the other is transitioning to Bcell_C4 (Fig. 5b, Supplementary Fig. 7b, 7c, 7d). Combined with the pathway enrichment results, these findings show that B cells mainly differentiated into two different types of cells with different functions in MPLC; one is involved in immune activation and regulation of the immune response, while the other is involved in antigen presentation. Different from the TCR clonotypes, which shared more overlap and correlation in foci grouped according to the source within the patient, the BCR clonotypes shared little overlap or correlation in MPLC (Supplementary Fig. 6f).
6. Myeloid cells have two functional distribution subtypes in MPLC.
Subclustering of 46836 myeloid cells revealed 10 subsets (Fig. 5e, Supplementary Fig. 9a.): M1 macrophages (MSR1+, MARCO+, etc.), M2 macrophages (MERTK+, MMP9+, etc.), MDSCs (CEACAM8+, ITGAM+, etc.), monocyte cells (TNFSF10+, FCER2+, CLEC10A+, etc.), dendritic cells (FCER1A+, CD1C+, CLEC4A+, etc.), myeloid dendritic cells (CCR7+, IL15+, CD1C+, etc.), plasma cells (IRF4+, CD36+, CXCR4+, etc.), secretory cells (CXCL10+) (Fig. 5g, Supplementary Fig. 9d). The percentage of M2 macrophages increased from normal to AIS/MIA to IAC, while the other subtypes were not significantly different in MIA/AIS and IAC (Fig. 5f, Supplementary Fig. 9b). Macrophages play an important role in maintaining organismal integrity by either directly participating in pathogen elimination or repairing tissue under sterile inflammatory conditions33, and Microphage_2 cells support tissue integrity by providing growth factors and healing capacity34. From a functional perspective, myeloid cells are divided into two main types: one exhibits high expression of IL1A, FCGR3A and IFN-γ (M1 macrophages), which can positively regulate the innate immune response. The other negatively regulates the adaptive immune response and regulation of stem cell differentiation (M2 macrophages). The increase in M2 macrophages will lead to malignant cells adapting to the immune response and escaping the response from the immune system (Supplementary Fig. 9c). The pseudotime trajectory axis derived from Monocle3 indicates that myeloid cells have four main branch trajectories (Fig. 5h). M2 macrophages were mainly concentrated in the left branch trajectory, M1 macrophages were mainly in the opposite branch trajectory, monocyte cells were in the middle to left stage, secretory cells were mainly concentrated in the lower left branch trajectory, and other Myeloid2_cells were mainly in the lower right branch trajectory (Fig. 5h). Next, we evaluated the expression of DEGs in the M1 macrophage (MSR1+; MARCO+) and M2 macrophage (MERTK+; MMP9+) populations in the whole trajectory. The expression of macrophage colony-stimulating factor genes, such as CSF1, MCEMP1, APOE, CD52 and FABP4, increased gradually during the process of transdifferentiating into M1 macrophages, and the expression of genes related to cytomembrane transport, namely, GPR183, CD48, CD93, and CCL5, increased during the process of transdifferentiating into M2 macrophages (Supplementary Fig. 9f).
7. Positive regulation of endothelial cell development is extensive in IAC.
We found four distinct subtypes by reclustering 2081 endothelial cells (Fig. 6a, Supplementary Fig. 8a): Endothelial_C1 (VIPR1+, IL1RL1+, TMEM100+, TNFSF10+, etc.), Endothelial_C2 (EMP1+, CD93+, PLVAP+, etc.), Endothelial_C3 (TNFAIP3+, CXCR4+, CREM+, etc.), and Endothelial_C4 (TFF3+, TIMP2+, STMN1+, etc.) (Fig. 6c, Supplementary Fig. 8d). The percentage of Endothelial_C1 and Endothelial_C2 in MIA/AIS and IAC were more different than those in normal samples, and the distribution of endothelial subtypes was not significantly different in MIA/AIS and IAC (Fig. 6b, Supplemental Fig. 8c). Then, we focused on the transcriptomic features and functions of Endothelial_C2 cells. The genes EMP1, TSPAN7, VWF, PLVAP and CD93 were expressed at high levels in these cells, and pathway analyses showed that Endothelial_C2 cells were enriched in positive regulation of endothelial cell development pathways [positive regulation of blood vessel diameter, regulation of endothelial cell differentiation pathway, and abnormal morphology of the great vessels] (supplemental Fig. 8d, Fig. 6d). Thus, Endothelial_C2 cells are related to the regulation of endothelial development, and their prevalence increases with the development of tumor invasion.
8. Fibroblast subtype distributions are similar in IAC, AIS/MIA and normal samples.
Fibroblast cells maintain a very stable state in MPLC (Fig. 6c, Supplementary Fig. 8c). A total of 2523 fibroblast cells were analyzed, and five subtypes were identified (Fig. 6c). In MPLC, there was no significant difference in the number of different fibroblast clusters, and there were also no significant changes in the pathway and function of genes expressed in different fibroblast clusters (Fig. 6f, 6g, Supplementary Fig. 8f). Fibroblast_C1 cells expressed high levels of genes associated with the Reactome elastic fiber formation annotation, such as COL3A1, COL1A1, COL6A3, COL10A1, and COL8A1, whereas Fibroblast_C5 cells highly expressed genes related to positive regulation of fibroblast proliferation, like RASD1, SFTPB and FOS (Fig. 6g, Supplementary Fig. 8g).
9. Characterization of cell–cell interactions involved in MPLC.
There are many specific signaling network changes in MPLC, which depend on complex interactions between structural and inflammatory cells35. One cell expresses a receptor or ligand, and this “ligand–receptor” pair will be defined as an incoming or outgoing interaction for this cell. We used CellPhoneDB to identify potential cell–cell interactions and define their changes in primary lung cancer. Compared with the MIA samples, the cells in tumors exhibited significantly increased interactions (Fig. 7a). Alveolar and myeloid cells had the most outgoing interactions of all cell types in both MIA and IAC samples. In comparison with MIA, the interaction between T cells and alveolar cells was significantly increased (Fig. 7b). Myeloid, cytotoxic T, natural killer and B cells are involved in the action of recognizing and presenting the antigens of epithelial cells among the interactions from immune cells to epithelial cells. Analysis of the biological functions revealed that interactions related to antigen recognition and presentation (CD74_COPA, HLA-C_FAM3C, and HLA-DRB1_OGN) were less abundant in IAC than in MIA and normal samples, which means that some escape of immune recognition occurred with deeper invasion in MPLC (Fig. 7c upper). The interactions related to independent cellular adhesion (NECTIN1_CADM3, FGF1_FGFR4, and CD44_FGFR2) were still less abundant in epithelial cells than in fibroblasts, which means that epithelial cells promoted structural changes in fibroblasts and reduced cell adhesion. The interactions regulating the growth and differentiation of numerous types of cells (ACVR1_BMPR2_BMP6 and ACVR1_BMPR2_BMP5) were more abundant (Fig. 7c, middle). On the one hand, epithelial cells reduce fibroblast adhesion; on the other hand, they strengthen the differentiation of other mesenchymal cells with deeper invasion in MPLC. The interactions related to endothelial and hematopoietic stem cell differentiation and development (SELL-CD34, SELP-CD34, NPR1-NPPC, and NPR2-NPPC) were more abundant in IAC than in normal and MIA samples (Fig. 7c, down).