InterCellar highlights data-driven patterns of cell-cell communication in scRNA-seq data
In order to demonstrate InterCellar’s functionalities implemented in the three biological domains (so-called, universes), we considered a publicly available COVID-19 dataset from Chua et al.6, containing nasopharyngeal and bronchial samples from 19 patients and from five healthy controls. In particular, we adopted the clinical classification of patients provided by the study, consisting of 8 moderate cases and 11 critical cases. Moreover, we retained the original cell type labeling performed by the authors and ran CellPhoneDB (v2.0)11 to obtain predicted cell-cell interactions as input to InterCellar. Thus, three separate input datasets were generated, corresponding to control, moderate and critical cases. Cell types can be grouped in (a) epithelial cells, namely basal, ciliated, ciliated-differentiating (ciliated-diff), FOXN4+, ionocyte, IFNG responsive cell (IRC), secretory, secretory-differentiating (secretory-diff) and squamous and (b) immune cells, namely B cell, cytotoxic T cell (CTL), regulatory T cell (Treg), natural killer (NK), natural killer T cell (NKT), natural killer T cell-proliferating (NKT-p), neutrophil (Neu), plasmacytoid dendritic cell (pDC), monocyte-derived dendritic cell (moDC), mast cell (MC), resident macrophage (rMa), non-resident macrophage (nrMa) and monocyte-derived macrophage (MoMa).
As the first step, we focused on the analysis of the cluster-verse and considered the total number of interactions per cell type while comparing critical to moderate cases (Fig. 2a). We observed rather small differences in numbers of interactions for cell clusters belonging to epithelial cells (with the exception of ionocytes), while immune cell types showed a higher variability. In particular, we could confirm the findings of Lin et al.7, who described a higher number of interactions for macrophages (MoMa and nrMa) as well as a lower number of interactions for T cells (CTL and Treg), when comparing critical to moderate cases. Moreover, in the same comparison, comprehensive consideration of all cell types highlights a striking gain of interactions for mast cells (MC) as well as proliferating natural killer T cells (NKT-p). This novel finding holds true when examining the number of interactions between a certain cell type of interest and all other cell types (Fig. 2b and Supplementary Fig. 1a,b). Specifically, both immune cells (e.g., MoMa, nrMa, CTL and Treg) and epithelial cells (e.g., ciliated, ciliated-diff, secretory and secretory-diff) show a consistent pattern of communication in critical cases, characterized by a higher number of interactions occurring between all cell types and MC or NKT-p cells. Interestingly, MCs have recently been hypothesized to have a major role in driving hyperinflammation in severe cases of COVID-19, due to their dysfunctional phenotype related to mast cell activation syndrome16,17.
InterCellar unravels the composition of cellular communication by examining ligand-receptor pairs
Another biological domain relevant for cell-cell communication is represented by the proteins (and thus genes, in the case of scRNA-seq) that compose ligand-receptor pairs. InterCellar’s gene-verse offers the possibility to investigate precisely which cluster-pairs communicate through which genes. We call “int-pair/cluster-pair couplet” the occurrence of an int-pair in a certain cluster-pair. These couplets are visually represented by InterCellar in dot plots, where the analyst interactively chooses int-pairs and cluster-pairs to examine. Thus, we proceeded in the analysis of the COVID-19 datasets, by plotting the interaction score (calculated by CellPhoneDB, see Methods) of int-pairs occurring in the communication between secretory cells, secretory-diff and all other cell types. In particular, by looking at int-pairs composed of selected chemokine ligands mentioned in Chua et al. 6(CXCL1, CXCL3, CXCL6, CXCL16, CXCL17), we could observe a clear enrichment of ligand-receptor pairs in the communication between secretory as well as secretory-diff cells and neutrophils (Fig. 3a). Specifically, these int-pairs were detected at varying levels of expression in the two disease conditions, while they were completely absent in control samples. This finding confirms the hypothesis of the authors concerning neutrophil recruitment induced by secretory cells6. However, no relevant difference could be recognized between moderate and critical cases.
To further investigate the latter phenomenon, we made use of InterCellar’s gene-verse functionalities by highlighting only int-pair/cluster-pair couplets that are unique to a certain phenotype (i.e., control, moderate or critical). We considered two groups of chemokine ligands, namely the CC- and CXC- subfamilies and selected all possible int-pairs found in each phenotype. CXCL-pairs showed a predominant enrichment of interactions unique to critical cases, in both epithelial (47%) and immune cells (57%) (Fig. 3b and Supplementary Fig. 2a). Interestingly, due to this unbiased view of all int-pairs and cell clusters, we could notice, in moderate cases, neutrophil recruitment carried out by epithelial cells such as basal, ciliated and FOXN4+ cells, mainly through the pairs CXCL1 & CXCR1 and CXCL1 & CXCR2. On the contrary, critical cases showed enrichment of CXCL3 & CXCR1 and CXCL3 & CXCR2 in the communication from IRC or squamous cells to neutrophils (Fig. 3b). Regarding immune cells, critical cases were characterized by an outgoing cellular communication from moDC, MoMa and NKT-p cells towards multiple other immune cell types. Moderate cases showed a unique pattern of communication promoted by NK cells and directed towards immune as well as epithelial (secretory and ionocyte) cells (Supplementary Fig. 2a). For CCL-pairs, we focused on selected immune cell types, separated into three groups: macrophages (MoMa, nrMa and rMa), NK cells (NK, NKT and NKT-p) and T cells (Treg and CTL) (respectively Fig. 3c and Supplementary Fig. 2b,c). While for macrophages the proportion of unique interactions favors the critical phenotype (56%), an inverse tendency could be noticed for NK (41% critical) and T cells (20% critical). In particular, among macrophages, MoMa and nrMa displayed many chemokine interactions that were unique to critical cases. These interactions involved recipient cell types such as MC, moDC, neutrophils and NKT-p. Specifically, MoMa communicates with these other cell types via CCL4L2 while nrMa via CCL7. At the same time, interactions of rMa via CCL3 in controls are absent in COVID-19 cases (Fig. 3c). On the contrary, NK and NKT as well as Treg and CTL showed an enrichment of interactions unique to moderate cases, directed towards secretory cells (among others) (Supplementary Fig. 2b,c). Altogether, these results are in line with two findings by the authors: on the one hand, critical cases are characterized by a highly inflammatory profile for MoMa and nrMa; on the other hand, a well-balanced immune response distinguishes moderate cases, in which the communication between immune cells and epithelial cells underlie an effective response to the viral infection6.
InterCellar links biological functions and signaling pathways to cellular communication
InterCellar’s function-verse investigates the third biological domain of interest and offers the opportunity to annotate cell-cell communication with biological functions and pathways. We performed functional annotation for each COVID-19 dataset, using all functional databases provided by InterCellar (see Methods). Then, we examined functional terms of interest through a sunburst visualization, which combines all int-pair/cluster-pair couplets that have been annotated to that particular function. To investigate signals of a “cytokine storm”18 in critical COVID-19 cases, as previously suggested by the analysis of chemokines, we selected the functional term “viral protection interaction with cytokine and cytokine receptor” (annotated from KEGG19) and compared the sunburst plots of moderate and critical cases (Fig. 4). As a first observation, we noticed that moderate cases had a higher total number of int-pairs annotated to this function (681 pairs, compared to 553 in critical cases). However, when considering only unique int-pairs, critical cases had the highest number (51 vs. 46). While CTL and Treg are predominant in moderate cases (with the highest total int-pairs, respectively 104 and 101), these two cell types are replaced by MoMa and nrMa in critical cases, occupying third and fourth place in the ranking with almost half of their int-pairs (55 and 54, respectively). Interestingly, control cases are comparable to moderate ones with respect to the ranking of CTL, B cells and Tregs (Supplementary Fig. 3). Furthermore, when examining the specific int-pairs, CTL communicated in moderate cases with nrMa (as well as other cell types) through a cytokine-encoding ligand called LTA (lymphotoxin-alpha), which is completely absent in critical cases. On the other hand, MoMa and nrMa profiles seem to be fairly conserved in the two disease phenotypes, with only a few int-pairs unique to critical cases (e.g., IL18 & IL18 receptor for MoMa::CTL; CCL7 & CCR1, CXCL5 & CXCR1, CXCL5 & CXCR2 for nrMa::Neu). Overall, these results suggest once again a dysfunctional immune response in COVID-19 critical cases, where a balanced lymphocyte-mediated regulation seems to be overthrown by excessive activation of macrophages.
InterCellar guides a focused analysis of cellular communication through the definition of functionally distinct modules
The workflow of InterCellar concludes with the analysis of int-pair modules, defined as groups of functionally similar interactions. In order to demonstrate this last step, as well as the general usability of InterCellar, we considered a second scRNA-seq dataset, composed of metastatic melanoma samples from Tirosh et al.20. This dataset includes both malignant cells and cells from the tumor microenvironment (TME) of 19 patients. We retained the cell type labels assigned by the authors, namely melanoma-malignant cells, T cells, B cells, macrophages (Macro), endothelial cells (Endo), cancer-associated fibroblasts (CAF) and natural killer (NK) cells. Once again, we ran CellPhoneDB (v2.0)11 to obtain ligand-receptor interactions as input to InterCellar.
Since cancer cells have been described as actively recruiting and reprogramming normal cells of the tumor ecosystem21, we used InterCellar to perform an in-depth analysis of the interactions that characterize malignant cells in their communication with the TME. Thus, we chose the malignant cell cluster as viewpoint in the analysis and focused on directed-outgoing interactions as those int-pairs where the ligand is expressed (and sent) by malignant cells to receptors expressed on all other cell types. Briefly, InterCellar’s definition of int-pair modules is based on the functional annotation performed in the function-verse: each module is composed of a subset of int-pairs that share common functional patterns. These modules can be visualized in a two-dimensional plot as a UMAP22, which represents int-pairs clustered by functional similarity (Fig. 5a, see Methods). Moreover, for each module, InterCellar displays a circle plot to identify precisely which cluster-pairs and genes are involved (Fig. 5b-e). Hence, a total number of nine int-pair modules was defined for the selected interactions and, for each module, we could associate functional terms found to be statistically significant (using a one-sided permutation test, see Methods). Interestingly, by examining the circle plots, the relevance of many int-pairs could be validated by further literature research. For example, we investigated the underlying interactions for modules #1, #3, #4 and #7. Int-pair module #1, characterized by the functional terms “extracellular matrix (ECM) organization” and “integrin signaling pathways”, is composed of collagen-encoding genes expressed by malignant cells, which interact with integrin-complexes expressed by endothelial cells, CAF, B cells and T cells, as well as with malignant cells themselves. Notably, the importance of collagen-integrin interactions in promoting cell invasiveness has been already described in the literature by Zhou et al.23. EPH-Ephrin signaling, which distinguishes, among others, int-pair module #3, has been associated with tumor neovascularization24 and is mainly promoted by interactions between malignant cells and endothelial cells or CAFs. Int-pair module #4 shows greater heterogeneity and comprises interactions associated with vascular endothelial growth factors (VEGFA and VEGFB), platelet-derived growth factors (PDGFA) as well as fibroblast growth factor (FGF5) that are enriched to a higher extent between malignant cells and CAFs, compared to the other cell types. Notably, the role of malignant cells in promoting CAFs proliferation has been previously investigated by Izar et al.25. However, the authors focused on the interaction between transforming growth factor-alpha (TGF-alpha) and epidermal growth factor receptor (EGFR), which was not found in this data. Finally, mechanisms of metastatic invasiveness in melanoma have been associated with the overexpression of annexin A1 in malignant cells26. Interactions involving this gene can be seen in the circle plot of int-pair module #7: ANXA1, sent by malignant cells, interacts with formyl peptide receptor (FPR) exclusively expressed by macrophages, suggesting a deleterious cross-talk between these two cell types.