A Network-based approach identifies common and cell-type specific pathways of the SARS-CoV-2-induced host response in the human airway epithelium
To identify candidate regulators responsible of viral-induced hijacking of the host machinery and to investigate their corresponding biological functions, organotypic air-liquid interface (ALI) cultures from adult human extrapulmonary airways (lower trachea and main bronchi) were exposed to SARS-CoV-2 (MOI 0.1, n = 3, Fig. 1a) or mock conditions (see Methods) after being cultured for 21 days. Control and infected cultures were then analyzed at 1, 3, and 6 days post-infection (dpi). Infection efficiency was confirmed by plaque assays and by the identification of nucleocapsid (NP) signals in immunofluorescence (IF) assays (Fig. 1a).
Single-cell RNA-Seq (scRNA-Seq) profiles were generated and analyzed to identify proteins controlling the programs hijacked by the virus in basal, ciliated, and secretory cells, as well as to assess their effect on cell function. Quality control filtering (Suppl. Table 1) revealed minimal batch effect, with cells clustering according to the expected epithelial phenotypes, namely basal, secretory and ciliated cells, as assessed by analysis of established lineage markers and further validated via SingleR analysis33 using previously reported single-cell profiles.34 (Fig. 1b,c; Suppl. Figure 1a-c) .
Although the total number of epithelial cells was not significantly changed in the SARS-CoV-2-exposed and mock control cultures, the relative proportion of cell types differed over time in culture. The percentage of basal cells increased while that of ciliated cells decreased both at 3dpi and 6dpi, compared to controls (Fig. 1d, p ≤ 2.2×10− 16, by Chi-Square test). Infection status was determined by alignment of quality-controlled cells against the SARS-Cov-2 genome confirmed by at least one read per cell mapped to the viral genome. Consistent with the single-stranded (ss) RNA results, the viral genome alignment analysis showed a progressive increase in the proportion of infected cells vs. non-infected cells at 3 and 6 days post-exposure (Fig. 1e). This also revealed N and Orf10 among the most expressed viral genes in infected cells, across all cell types (Fig. 1f). In contrast to prior reports34–36, no statistically significant differences in infection rates were detected across these three cell populations (Suppl. Table 2), a trend that was independent of the specific threshold of the mapped reads to the SARS-CoV-2 genome used for the identification of infected cells (Suppl. Figure 1d).
Next, we investigated the cell type-specific host response of ciliated, basal and secretory cells by analyzing the differential activity of all regulatory proteins, as assessed by metaVIPER11, the extension of VIPER to single-cell analyses. To generate sufficiently informative host response signatures for metaVIPER analysis, read counts were uniformly normalized using a metaCell approach (Fig. 2a see methods). MetaVIPER (henceforth VIPER for simplicity) was then used to transform the differential gene expression signatures of individual SARS-CoV-2-infected metaCells at 3 and 6 dpi—vs. metaCells from either mock controls or non-infected cells at the same time point—into differential protein activity profiles. For this purpose, we generated a context-specific regulatory network by analyzing a publicly available repository of primary human airway epithelial gene expression profiles37,38 with the ARACNe algorithm10,11. The results of this analysis was a subpopulation-specific repertoire of proteins that were aberrantly activated or inactivated in response to SARS-CoV-2 infection in basal, ciliated or secretory cells (Fig. 2b, Suppl. Figure 2a).
The rationale for comparing infected cells to either mock or non-infected cells is critical to deconvoluting unspecific/indirect effects (cytokine secretion by infected cells affecting the non-infected bystanders) from the specific/direct effects (the reprogramming unique to the infected cells, where the unspecific effects are subtracted). As such, we define the signature derived from infected vs. mock cells as the Unspecific Infection Signature (UIS) and the signature derived from infected vs. non-infected bystander cells as the Specific Infection Signature (SIS).
As expected, hallmark39 pathway analysis of the UIS signature revealed interferon alpha/gamma signaling, protein secretion, complement, and IL6/JAK/STAT3 signaling as the most significantly activated pathways in all cell types (Fig. 2c, Fig. S2b). Indeed, the 50 most differentially activated proteins included interferon-induced factors such as IFITM1—a known SARS-CoV-2 infection cofactor in the human lung40—IFI6, IFI27, MX1, IRF9, STAT1, STAT2, as well as a number of others proteins (ZNFX1, PLSCR1, SP110, PARP14, TNFSF10, CASP1, OAS3) that were prominently activated (p ≤ 0.001) in the host response signature (Fig. 2b-c, Suppl. Figure 2b, Suppl. Table 3a)41.
When normalized enrichment scores (NES) were averaged over the three cell types at days 3 and 6, pathway enrichment analysis of the UIS signature—either based on differentially expressed genes or differentially active proteins—was highly concordant. Statistical significance of the concordance was assessed by Spearman correlation (ρ = 0.641, p = 8.2×10− 6) and F-statistic (F = 62.0, p = 1.7×10− 9). Results were also consistent with previous reports of significant innate immune response activation by SARS-CoV-2 in contexts as diverse as human airway-derived lung cancer cells (calu-3), gastrointestinal organoids of different origins30, and cultured human bronchial epithelial cell (HBEC) ALI cultures34,39,40 (Suppl. Figure 2c).
Moving from pathway analysis to analysis of individual proteins, besides IFNs and related factors, the UIS also identified the zinc-finger protein ZNFX1, as a top activated MR in the three cell types analyzed, particularly at 3dpi (Fig. 2b left). Interestingly, activation of ZNFX1 in immune cells isolated from bronchoalveolar lavage of COVID-19 patients has been identified as an important component of the antiviral response41. Mechanistically, these studies show that ZNF proteins could directly recognize and bind to CpG sites in SARS-CoV-2 to induce IFN in infected cells. Thus, our findings of similar ZNFX1 activation in a system that consists solely of epithelial cells is noteworthy, suggesting that the ZNF-IFN association may occur in a broader cellular context. However, our MR analysis also revealed other ZNF family members among the top inactivated proteins in specific cell types. For example, ZNF491, ZNF157, ZNF19 emerged from the UIS as significantly inactivated in ciliated cells compared to secretory or basal cells (p ≤ 0.01, by t-test) (Fig. 2b right). Whether the differential inactivation of ZNF family members selectively in specific cell types influences the susceptibility to viral infection, remains to be investigated.
There is also increasing evidence that viruses may hijack specific ribosomal proteins to achieve optimal viral translation42. While broadly confirming these findings, our analysis also suggest that SARS-CoV-2 may selectively target the translational machinery in distinct cell types. For instance, we found specific ribosomal subunits (RPS6, RPS3, and RPL7)43, and translation-related proteins (NACA44, YBX145, and PRMT146) to be significantly inactivated only in secretory and basal cells (p < 0.01 by t-test)(Fig. 2b right).
Analysis of infected vs bystander signatures reveal cell type-specific SARS-CoV-2 host response MRs.
Previous studies show that IFN signaling from SARS-CoV-2 infected cells indirectly affects their immediate neighbor cells, thus contributing to a broad IFN signature activation across the entire airway epithelium34. To account for potential indirect, paracrine effects, such as those induced by IFN signaling, we reanalyzed our data, now comparing infected (≥ 1 viral RNA reads) with non-infected bystander cells (with no viral RNA reads) in SARS-CoV-2-treated cultures, as captured by the Specific Infection Signature (SIS). Specifically, we reasoned that analysis of the SIS signature could identify additional, more specific host cell response mechanisms that otherwise could not be revealed by the UIS, in which infected and mock cells were analyzed without bystander cell contribution.
For this, we focused on 3dpi to prevent inclusion of secondary effects (Fig. 3a). As expected, SIS analysis no longer highlighted “immune response” as differentially activated (Fig. 3b-c to 2b-c). Instead SIS identified MRs activated specifically in the infected cells, in some cases also revealing cell-type specificity. For instance, hallmark features of SIS, such as protein secretion, apical junction, androgen response, and MYC target pathways were significantly activated in secretory cells, while mitotic spindle was found activated mostly in ciliated and secretory cells. Hallmark estrogen response was found in basal and ciliated cells but not in secretory cells (Fig. 3c, Suppl. Figure 2b, Suppl. Table 3b). Viral infection is facilitated and propagated by heightened cell to cell transmission in apical junctions of the luminal mucociliary cells, with virion accumulation in the mucus layer, facilitating infection of neighboring cells47–49. SARS-CoV-2 has also been proposed to hijack microtubule kinases crucial for organization of mitotic spindle to influence viral load and invasion. Based on this, anti-cancer drugs that block mitosis through disruption of microtubule organizing center have been tested in COVID-19 human clinical trials50. Yet, since these effects appear to be largely population cell-specific, these drugs may not be effective in an heterogeneous cellular context, thus underlining the relevance of the information provided by the SIS. The seemingly unrelated pathways described above collectively reflect the complex orchestration of events associated with COVID pathogenesis in the airway epithelium, a system known for its significant cellular diversity.
At the individual protein level, analysis of the SIS signature identified ITGA2, NCOR1 and HDAC1 as the top three most activated host-cell proteins in ciliated cells following SARS-CoV-2 infection (Fig. 3b). ITGA2 encodes the alpha subunit of a transmembrane receptor integrin binding present in anchoring junctions to mediate cell-cell and cell-matrix interactions. A number of studies show that human adenovirus, herpesvirus, and other viral pathogens internalize into cells through a mechanism of integrin-mediated endocytosis51,52. Notably, there is evidence of SARS-CoV-2 binding to Integrin and activation of endocytosis as an essential ACE2-independent component of the viral infection51,53. ITGA2 has also been linked to p38 and p16-mediated senescence of SARS-CoV-2-infected cells54.
The two additional MR proteins, representing key putative mechanistic determinants of SARS-CoV-2 infection in ciliated cells, were NCOR1 and HDAC1. NCOR1 is part of a family of transcriptional co-repressors shown to form complexes with both class I and class II histone deacetylases (HDAC) in vitro and in vivo to repress gene expression. HDAC inhibitors repress ACE2 expression and other genes responsible for viral entry. Yet, the specific mechanisms by which activation of NCOR-HDAC in ciliated cells promotes viral entry are not fully understood55,56. Other HDAC family members identified in our screen included HDAC7 and HDAC2, among the most activated proteins in secretory cells and basal cells, respectively (Fig. 3b). Of interest, HDAC2 has been shown to interact with NSP5, the main SARS-CoV-2 protease to inhibit HDAC2-mediated IFN responses2. Our finding of the lysine acetyl transferase KAT2B as the most aberrantly activated MR in infected secretory cells could further support the idea of a mechanism of SARS-CoV-2 repression of IFN-related gene expression via chromatin remodeling to increased virulence. SMC3 (Structural Maintenance of Chromosomes 3) emerged as the second most aberrantly activated MR in secretory cells. SMC3 is a key component of the cohesin complex, which plays a critical role in the regulation of chromosome structure and gene expression. SMC3 is reported as a non-histone substrate of HDAC in cancer and has been shown to be involved in the restructuring of the chromatin architecture in SARS-CoV-2 infection57,58. Specific activation of these mechanisms in infected secretory cells is intriguing and worth further future studies.
Consistent with prior studies, VIPER analysis identified MRs representing key, complementary subunits of the middle module of the mediator complex, including MED21 and MED7—aberrantly activated in ciliated and basal cells, respectively—and MED31—aberrantly inactivated in basal cells. This complex associates directly with RNA polymerase II to regulate its function59. Biochemical structural data have shown that MED7 and MED21 form a hinge that enhances stable interactions between RNA polymerase II and the mediator complex, indicating that the activation of these proteins may be associated with increased transcriptional machinery activity59. Together the data suggested engagement of distinct SARS-CoV-2-mediated epigenetic reprogramming mechanisms in these populations of airway epithelial cells.
Proviral host factors crucial for SARS-CoV-2 infection are selectively induced in different cell types.
We then examined whether host factors previously reported to physically interact with SARS-CoV-2 proteins or shown to be critical for infection (proviral replication), based on CRISPR studies3,31,32, were differentially enriched in MR proteins. Host factors co-opted by SARS-CoV-2 during infection have been previously investigated, using genome-scale CRISPR knockout screens, in multiple cell types including Vero.E6 (monkey)6, Huh.7.5 (human hepatocarcinoma)4, A549 (human lung adenocarcinoma)32, and Calu-3 (immortalized human airway-like cell line derived from submucosal glands)3,31. These screens helped identifying several genes, including ACE2 and HMGB1, whose inactivation increased the viability of infected cells compared to non-targeting sgRNAs. From these studies, we selected the top 50 TFs and cofactors identified by CRISPR screens in A549 and Calu-3, based on their common origin from the lung/airway epithelium.
Indeed, UIS analysis recapitulated several of these proviral factors among the 50 most differentially active proteins (i.e., candidate MRs of host-cell hijacking, hereafter MRs for simplicity) across all cell types, at both 3 dpi and 6dpi (Fig. 4a). Leading-edge analysis revealed IRF1, IRF9, and STAT1 as the top most conserved factors across all three cell populations (Fig. 4b). The identification of interferon-regulatory factors and core transcriptional regulators of the inflammatory response as leading edge proteins—i.e., proteins identified among the most significant MRs but also positively modulating virulence, as supported by previously published CRISPR screens3,31,32—was intriguing as it suggested a dual role for these proteins as both proviral and antiviral factors during SARS-Cov-2 infection. Additional top leading-edge proviral factors identified by UIS signature analysis included MRs of distinct functional classes, including several previously implicated in viral internalization. Among these, four ATPase and accessory proteins family members (ATB8B, ATP6AP2, ATP6V1C1, ATP6V1H) were among the most aberrantly activated. Studies using bafilomycin to inhibit vacuolar-ATPase60 suggest that these ATPases act as proviral factors by facilitating SARS-CoV-2 entry through the endosomal route. This is further supported by our findings of the kinesin KIF13B and the small GTPAse RAB14, known regulators of intracellular trafficking, vesicle formation and endosomal recycling61,62, as key leading edge MRs (Fig. 4b).
We then examined which of the top 50 proviral factors were also nominated as MRs by the SIS analysis. USP33, CUL5, SNX27, and PBRM1 emerged as the only leading-edge proviral factors identified as activated MRs in all the three cell types. (Fig. 4c-e). Interestingly, two of these factors have been functionally implicated in the regulation of ubiquitination in viral infection. USP33 is a deubiquitinase enzyme reported to modulate the host inflammatory response and antiviral activity through regulation of the turnover of IRF963. CUL5, a component of the Cullin-5-RING E3 ubiquitin-protein ligase complex, has been shown to act as a critical antiviral host factor promoting ubiquitin-dependent degradation of the SARS-CoV-2 ORF9b protein64. The Sorting Nexin27 protein (SNX27) controls cargo recycling from endosomes to the cell surface inhibit viral lysosome/late endosome entry by regulating ACE2 abundance. Indeed, SNX27 can be hijacked by SARS-CoV-2 to facilitate viral entry65. PBRM1 is a chromatin modulator of the SWI/SNF chromatin remodeling complex, which includes SMARCA4, SMARCB1, SMARCC1, ARID1A, DPF2, SMARCE1, also implicated in ACE2 expression regulation and, thus, in host cell susceptibility to viral infection66.
Taken together, these data reinforce the hypothesis that epigenetic reprogramming represents as a key mechanism leveraged by SARS-CoV-2 for hijacking host-cell programs in the airways epithelium. The analysis also helped identifying proviral factors that are either preferentially activated in a specific SARS-CoV-2-infected subpopulation or those shared across different subpopulations. Intriguingly, among the 50 leading-edge proviral factors present in the SIS signature, most were found activated in ciliated, basal cells or in both populations but not in secretory cells (Fig. 4d-e). Comparative analysis of the SIS and UIS MR signatures confirmed a lower number of proviral factors differentially activated in secretory cells (Suppl. Figure 3). However, it is important to note that SARS-CoV-2 induces a robust MR activation selectively in secretory cells, as demonstrated by the SIS analysis of this subpopulation (Fig. 3b).
Taken together, the fact that our analysis identified key proteins known to modulate SARS-CoV-2 infection, suggests that top population cell-specific MRs, as well as those conserved across multiple cell populations, that were not previously characterized may represent key regulators of novel mechanisms of host-cell hijacking by the virus, for future low-throughput validation assays. Moreover, the diversity of factors revealed by these analyses further reinforced the idea that the signature elicited in the airway epithelium by SARS-CoV-2 represents a complex combination of host defense and virus-hijacked signals.
A large-scale functional drug screen identifies candidate drugs that effectively invert the SARS-CoV-2-induced MR signatures in the human airway epithelium.
To identify drugs capable of inverting the MR activity signature induced by SARS-CoV-2 infection, we performed a large-scale drug perturbation screen in these organotypic airway epithelial cell cultures, thus supporting prediction of candidate MR-inverter drugs using the ViroTarget and ViroTreat30 algorithms. These represent the direct extension to the viral infection context of OncoTarget18,20 and OncoTreat18,19,67, two CLIA-compliant (https://www.cms.gov/medicare/quality/clinical-laboratory-improvement-amendments) algorithms extensively validated in both a pre-clinical and clinical oncology contexts. Drugs predicted as MR activity inverters by OncoTreat, from both bulk18,67 and single-cell68 profile data, have been validated by rigorous in vitro and in vivo assays. Consistent with this premise, we proposed that pharmacologic targeting of either individual MRs (ViroTarget) or of the entire MR protein module that regulates the virus-induced host-cell response (ViroTreat) could mitigate viral replication and infection co-morbidity. Briefly, ViroTarget analyzes the list of MRs to assess whether any of them may represent a high-affinity binding target of a clinically-relevant drug, among the 1,738 in DrugBank69. In contrast, ViroTreat assesses inversion of the virus-induced MR activity signature by assessing the enrichment of the 25 most activated and 25 most inactivated viral infection MRs in proteins differentially inhibited and activated in drug vs. vehicle control-treated cells, respectively. For this analysis, we used the 3dpi SIS signature, representing the most specific early hijacking of host cell programs by the virus.
ViroTarget identified 32 MRs (p ≤ 0.05, Benjamini-Hochberg corrected) aberrantly activated in at least one of the infected cell types (ciliated, basal, or secretory cells)—as high-affinity targets of 87 small molecule inhibitors in DrugBank. (Suppl. Figure 4). Druggable proteins associated with epigenetic control of gene expression, such as the HDAC family members HDAC1, HDAC2, HDAC3 and HDAC9, were identified as SIS MRs in distinct cell types. Consistently, HDAC inhibitors, such as romidepsin identified by Virotarget were already proposed as antiviral drug candidates30,70. Additional MRs representing chromatin remodeling enzymes include EZH2, DNMT1 and CHD1; the latter is a chromatin organization modifier implicated in the recruitment of Influenza virus polymerase to promote viral multiplication71. These findings underscore the relevance and need for further investigation of epigenetic mechanisms hijacked by SARS-CoV-2 to induce host-cell reprogramming. ViroTarget also identified two out of three RXR retinoid receptors as candidate druggable MRs, as also supported by independent evidence from the functional analysis of SARS-CoV-2 infected human IPSC-derived organoids9. Another notable druggable MR identified by ViroTarget was KRAS. This protooncogene found frequently mutated in human cancers has been implicated in viral stress responses mediated by GRP78 a chaperone induced by SARS-CoV-2 infection72,73. Taken together, these data suggest that ViroTarget analysis can recapitulate several previously identified drugs as well as nominate additional ones for follow-up validation.
ViroTreat analysis requires a large-scale compendium of RNA-Seq profiles representing the response of cells to treatment with a large library of clinically relevant compounds. For this purpose, we used the PLATE-Seq technology and VIPER, developed by our labs10,11,74 to generate protein activity from the RNA-Seq profiles of airway epithelial cultures treated with 441 FDA-approved drugs. We selected drugs with well-characterized bioactivity, safety, and bioavailability properties, as determined by preclinical and clinical studies. VIPER analysis of RNA-Seq profiles of drug vs. vehicle control-treated cells helps characterize the proteome-wide mechanism of action (MoA) of each drug, which can be used to assess the drug’s ability to invert the activity of the MR signatures identified from VIPER analysis of the SIS signature of each individual subpopulation (Fig. 5a) (see methods). Here we define MoA as the repertoire of proteins that are differentially activated or inhibited by a drug in a tissue of interest, including high-affinity binding targets, secondary lower-affinity targets, and context-specific indirect targets. Taken together, these proteins define the drug’s pharmacologic activity18.
For this purpose, drugs were added to differentiated ALI day 21 airway organotypic cultures at the their Cmax concentration and RNA-Seq profiles were generated at 24h post treatment, using the fully automated PLATE-Seq technology74 (see methods) and analyzed by VIPER. Out of 441 drugs, ViroTreat identified 11 as statistically significant MR-inverters across all three airway epithelial cell types (Fig. 5b, Suppl. Figure 5a,b, Suppl. Table 4). We further investigated the enrichment of MR in the leading edge of prior CRISPR screens, as described in the previous section (Fig. 4d,e). ViroTreat analysis showed leading edge MR activity inversion in nearly all cell types, with 9 out of 11 drugs (Fig. 5c, Suppl. Figure 6a). Notably, bedaquiline—an established inhibitor of ATP synthase in drug-resistant pulmonary tuberculosis75—was identified as a candidate inverter of USP33 and CUL5 activity.
Pathway enrichment analysis using gene sets from Reactome Pathway (RP)76 showed that 8 out of the 11 drugs identified by ViroTreat also induced statistically significant negative NES for pathways associated with virus-induced processes, including membrane trafficking, infectious diseases, post-translational protein modification, and Rho GTPase signaling (Fig. 5d). Among the MRs whose activity was inverted by these drugs we found SMC3 and MED7 (see previous sections), as well as CREB1, USP33, ZFK451, SMARCA5 and others proteins reported in SARS-CoV-2-host interactions and pathogenesis databases (Suppl. Figure 6b)61. ViroTreat also identified USP33, PAWR, ATP6AP2, CUL5, ROCK1, RAB14 as proviral MRs inactivated by 11 out of the 411 drugs screened. While most of the factors targeted were enriched in ciliated cells, two of them (CUL5, USP33) were present in all cell types (Suppl. Figure 6a). Thus, the effect of ViroTreat-inferred MR-inverters drugs extends to proteins controlling mechanisms associated with viral replication. As such, these drugs should not only mitigate the hijacking of host-cell programs but also directly restrict viral replication.