Development and validation of hUSI
To systematically learn and evaluate the comprehensive signature of CS, we developed a workflow including data collection, data re-processing, feature extraction and quantification of senescence degree (Fig. 1a). We first collected RNA-seq data sets derived from representative human senescence types serving as senescence training samples, which encompassed five cell types (fibroblasts, endothelial cells, astrocytes, melanocytes, and keratinocytes) and six senescence types (ionizing radiation-induced senescence (IRIS), replicative senescence (RS), oxidative stress-induced senescence (OSIS), oncogene-induced senescence (OIS), natural senescence (NS) as well as compound-induced senescence (CIS)), along with the corresponding non-senescent samples serving as young controls19–27 (Extended Data Fig. 1a,b and Supplementary table S1). Next, considering these data sets were derived from different experimental methods and sequencing protocols, we re-preprocessed all the raw data with the same pipeline to generate standard and normalized profiles for feature extraction and validation (Extended Data Fig. 1b and Methods). After processing, we found that, as expected, CDKN1A and CDKN2B (encode the well-known senescence marker p21 and p15, respectively) showed significant higher expression level, demonstrating reliability of samples and the overall analysis pipeline (Fig. 1b).
Then, we went on to acquire features of the senescence profiles by machine learning algorithms. Multiple machine learning algorithms have been employed for mining genes associated with individual aging or CS, such as regression, elastic net, and random forests28–30, facilitating quantification of senescence degree in tumor and normal cells. However, senescence, as a complicated and continuous state, its heterogeneity has not been fully considered in these models31. In this study, we selected one-class logistic regression (OCLR) algorithm to learn the features of senescence transcriptome profiles, as it has been demonstrated superior performance on capturing cell heterogeneity32. After training, OCLR learned the features of senescence samples in training set, in other words, all genes were respectively assigned with different weights representing their contributions to senescence (Methods). In our learned-senescence features, except for genes upregulated in CS and associated with SASP (such as APOD, EHF and SAA2)33–35 were assigned with top weights, some genes with high positive or negative weights while their functions in CS were poorly reported (such as OLAH, CADM3 and HMSD) (Extended Data Fig. 1c). These results suggest that OCLR not only captures the known senescence features but also identifies potential novel senescence-associated genes. To our surprise, classical senescence marker, including CDKN1A, CDKN2B and SERPINE1, are only assigned with slightly positive weights, probably because of relatively low expression levels in samples (Supplementary table S2). To further examine the biological interpretability of learned-senescence features, gene set enrichment analysis (GSEA)36 was performed based on the weight of each gene (Methods). We found that multiple senescence associated gene sets were positively enriched, including interferon-gamma response37, KRAS signaling38, inflammatory response39, hypoxia and p53 pathway40, 41 (Fig. 1c, left panel). On the contrary, the proliferation associated pathways (such as G2M checkpoint42, E2F targets43, mitotic spindles44 and MYC targets45) were in the negative enrichment terms (Fig. 1c, right panel). These results supported the reliability of the features learned by OCLR in reflecting senescence.
In terms of quantifying senescence degree, the Spearman correlation coefficient between gene weights and expression values was selected as the metric to quantify senescence degree, defined as human universal senescence index (hUSI) (Methods). To test the stability and reliability of hUSI, we used leave-one-out cross-validation (LOOCV) strategy to calculate average correctly rank probability (CRP) for each iteration, and the resulted CRP reached 0.9 (Extended Data Fig. 1b, d and Methods). We next validated whether hUSI can be influenced by batch effects arising from variations in experimental conditions, sequencing platforms, or analysis pipelines. We compiled bulk RNA-seq datasets from seven independent studies, each comprising oncogene-induced senescent IMR90 cells induced by 4-hydroxytamoxifen (4-OHT), along with corresponding control cells24, 46–49 (Supplementary table S1). We calculated hUSI for each sample and found that all senescent groups have much higher hUSI values (hUSIs) than the non-senescent groups (Fig. 1d). Besides, with a plethora of genes included in the senescence features for calculation, hUSI is technically more robust to profiles with limited or sparse gene signals such as microarray and single-cell RNA-seq (scRNA-seq) data. Therefore, we applied hUSI on six senescence-related microarray datasets and the most of them were derived from cell types which were not included in training set50–55. The results showed that hUSIs were consistently higher in all senescent groups compared to non-senescent ones (Extended Data Fig. 2a). To test the robustness of hUSI, we generated simulated sparse profiles from these microarray transcriptome profiles by randomly zeroing-out expression signals. We found that even zeroing-out 50% of genes expression signals, hUSIs still represented higher levels in all senescent groups (Extended Data Fig. 2b). All these results suggested that hUSI, based on comprehensive senescence features and effective nonparametric rank-based correlation56, is pretty stable and robust.
hUSI shows reliable performance in quantifying senescence degree
To assess the generalizability of hUSI, we gathered three bulk RNA-seq datasets (including immortal MDAH04 cells and senescent MDAH04 cells induced by different chemical compounds57, WI-38 cells treated with 4-OHT for different days58 and proliferative WI-38 cells and senescent WI-38 cells induced by replication58), notably the conditions of these samples are not exactly same as samples in the training set (Supplementary table S1). We found that most senescent groups exhibited significant higher hUSIs compared to non-senescent ones even the sample size is limited (Extended Data Fig. 2a). Moreover, hUSI also demonstrated its ability to discern aggravated senescence in samples induced by extended 4-OHT exposure time (Fig. 1e, middle panel).
Next, we calculated hUSIs for a large normal samples dataset obtained from the Genotype-Tissue Expression Project (GTEx). We observed that hUSIs progressively and significantly elevated with increasing age, consisting with the continuous accumulation of senescent cells in aging process59 (Fig. 2a). To validate the reliability of hUSI in assessing senescence degree, we calculated Spearman correlation coefficient between hUSIs and CS scores, which was a tool based on conducting gene set variation analysis (GSVA) on a curated set of 1,259 genes derived from studies on replicative cell senescence17. The results showed overall positive correlations of these two methods (R = 0.7), and across 29/30 tissues (R from 0.28 to 0.85) (Fig. 2b left panel and Fig. 2c upper panel). The same strategy was applied on a large tumor samples dataset from The Cancer Genome Atlas (TCGA) datasets. Despite the heterogeneity in tumor samples, hUSIs still showed overall positive correlations with CS scores (R = 0.52) and across all cancer types (R from 0.12 to 0.94) (Fig. 2b right panel and Fig. 2c lower panel). Of note, we discovered that hUSIs demonstrated higher variations in different cancers compared to CS scores, which might indicate that hUSI enables to reveal more intrinsic heterogeneity of senescence across different tumor types (Fig. 2c lower panel).
hUSI has better performance in distinguishing senescence cells
Given the reliable and robust performance of hUSI on scoring bulk samples under various conditions, we next applied hUSI on four scRNA-seq datasets derived from primary senescent cells induced by various stressors (including oncogene49, 60, radiation61, and replication61, as well as secondary senescent cells triggered by paracrine signals62) to assess the robustness of hUSI at single-cell level across diverse conditions. The senescence status of these cells had been confirmed in respective studies by examining senescence marker genes and senescence-associated β-galactosidase (SA-β-Gal) staining49, 60–62 (Supplementary table S1). Non-senescent cells from each dataset were also included for comparative analysis. After quantifying the senescence degree of each cell using hUSI, we observed significantly higher hUSI levels in senescent groups than non-senescent groups across all four datasets, supporting the applicability of hUSI on scRNA-seq data (Fig. 3a). Next, we compared the performance of hUSI with other three groups of senescence qualification strategies (including those based on gene expression level, computed score and enrichment score of single sample GSEA (ssGSEA)) (Methods).
First, we first obtained 12 well-known CS or proliferation associated markers (GLB1, TP53, CDKN1A, CDKN2A, CDKN2B, CDK1, CDK4, CDK6, MKI67, LMNB1, IL1A, and RB1) and separately used their normalized expression values to directly classify cells, as their upregulation or downregulation is widely employed to identify CS status63–71. We found that only CDKN1A exhibited a higher trend in senescent samples than control samples across all datasets (Extended Data Fig. 3a, left panel). To better compare the performance of hUSI and the markers in classifying senescent cells in limited scRNA-seq datasets, we randomly split each dataset into 10 folds and replicated the process three times, and then calculated the ranks of average Area Under Curve (AUC) of all units in each dataset (Supplementary table S3 and Methods). We observed that hUSI exhibited excellent performance compared to all the tested classical markers (Fig. 3b, left panel and Supplementary table S3).
Second, we compared hUSI with five existing senescence score computing methods, including DAS, mSS and their combination (DAS + mSS)14, lassoCS18 and CSS28. To our surprise, these methods only gave senescent group a higher score level than control group in certain datasets (Extended Data Fig. 3a, middle panel). We then applied the same strategy above to calculate average AUC ranks. hUSI also achieved the highest average AUC rank compared to all computed senescence scores (Fig. 3b, middle panel). Additionally, we observed that DAS + mSS, as expected, outperformed both DAS and mSS individually (Fig. 3b, middle panel). Of note, except for hUSI, all these methods exhibited substantial variations across four datasets (Fig. 3b, middle panel), supporting the more stable performance of hUSI.
Finally, considering aging and senescence-associated gene sets have been commonly used to quantify CS by enrichment score using ssGSEA, in the present study, we collected eight publicly available senescence-associated gene sets (including CellAge15, GenAge72, ASIG73, SASP (downloaded from MSigDB under acessesion ID R-HSA-2559582), AgingAtlas74, SenUp75,SenMayo16, and SigRS76) to calculated their ssGSEA scores in four scRNA-seq datasets (Supplementary table S4). The result showed that only SenUp gave higher scores for senescent groups than control groups across four scRNA-seq datasets (Extended Data Fig. 3a, right panel). After calculating average AUC ranks, hUSI still exhibited superior performance over all gene sets, with minimal variation observed across the four single-cell datasets (Fig. 3b, right panel). Furthermore, we found that genes from all these gene sets can be found in our features, and genes had been assigned with different weights which enable hUSI to capture a broader spectrum of gene expression signals in the senescence evaluation process (Supplementary table S2,4). These results above combined to suggest that hUSI has relative superiority and stability across different scRNA-seq datasets comparing to other current methods.
hUSI enables to evaluate senescence burden in complex conditions
After validating the outperformance of hUSI in distinguishing senescent cells, we next sought to apply hUSI on single-cell data from real pathological tissues. The accumulation of senescent cells has been reported to increase the susceptibility to COVID-19 patients by contributing to SARS-CoV-2-mediated hyperinflammation and cytokine storm77, 78. Consequently, the targeted elimination of these senescent cells has been proposed as a potential treatment strategy for COVID-1977, 78. However, the deconvolution of senescent status across various cell types in infected lung tissues and the study of detrimental effects of different senescent cells on patient survival are still lack. Thus, to evaluate the senescence burden of COVID-19 patients, we calculated the hUSIs for a single-nuclei RNA-seq (snRNA-seq) dataset (containing a total of 116,313 nuclei) derived from infected and normal lungs with donor age ranging from 58 ~ 84 years old79 (Fig. 3c and Methods). We found that most cell types (including epithelial cells, endothelial cells, fibroblasts, myeloid, and neuronal cells) from COVID-19 patients exhibited significantly higher hUSI values compared to those from normal donors (Fig. 3d). Intriguingly, a reverse trend was observed in B cells and T cells, suggesting the activation of immune cells following COVID-19 infection80 (Fig. 3d).
To better discern various senescence status, we applied a gaussian mixture model (GMM) to fit the distribution of hUSIs within all tested cells and successfully classified them into four distinct classes (C1 ~ C4) (Fig. 3e, left panel and Methods). Their senescence degrees were further validated by the higher expression level of classical senescence-associated genes (CDKN1A, IL1A, IL6, IL8, CCL2, CXCL10, MMP9, SERPINE1, THBS1 and TIMP1) and lower expression level of proliferation markers (LMNB1, MKI67 and DHFR), consisting with the reported elevated cell senescence responses to SARS-Cov-2 infection81 (Fig. 3e, right panel). We also observed that COVID-19 lung tissue has a higher proportion of the most senescent cell class (denoted as C4) cells compared to normal tissue (Fig. 3f), consistent with the reports suggesting a high accumulation of senescent cells in COVID-19 patients77, 82. Besides, patients with faster disease progression showed more accumulation of senescent cells (Fig. 3g). These results all suggested that hUSI successfully revealed survival-detrimental senescent cells accumulated in COVID-19 lung tissue across various cell types。
We then examined the difference in the fraction of four cell groups for each cell type between COVID-19 and the normal samples. The results showed there are higher fractions of senescent cells existed in the cell types with a higher risk of exposure to SARS-CoV-2 or hyperinflammatory microenvironment, such as monocyte-derived macrophages, inflamed endothelial cells, pathological fibroblasts and alveolar type 1 progenitor cells (AT1) (Fig. 3h and Extended Data Fig. 4a, b)78, 83–85. While alveolar type 2 progenitor cells (AT2), which are targeted by SARS-CoV-2 through the angiotensin-converting enzyme 2 (ACE2), was reported to exhibit apparent senescence and a proinflammatory phenotypes86, AT1 accumulated in a higher proportion in COVID-19 lung tissue than AT2, possibly because AT2 can differentiate into AT1-like cells for alveolar regeneration in COVID-19 patients87.
To investigate the alterations in senescent cells, we performed differential gene expression analysis between C4 and C2 (which is the second young class (Fig. 3f)). We did not take C1 class as the control due to its very small cell numbers, which usually lead to some bias in differential analysis. Differential genes (DEGs) of AT1 and AT2 were respectively enriched on KEGG and GO databases. The results showed that senescent AT1 and AT2 cells have enriched on pathways associated with antigen process, extracellular matrix and immune cytotoxicity, especially AT1 has enriched on p53 signaling pathway, indicating a higher relevance of these senescent on infection response, cellular communication and cellular senescence (Fig. 3i and Extended Data Fig. 4c). In addition, DEGs were also enriched in senescent monocyte-derived macrophages as it showed largest fraction difference in C4, reaching 0.29, and was reported to drive the inflammatory response to SARS-CoV-2 and contribute to cytokine storms in severe COVID-1988, 89. The results showed that pathways, including positive regulation of T cell-mediated immunity and leukocyte-mediated cytotoxicity, was enriched in these cells, indicating their crucial roles of senescent cells in macrophage-mediated clearance of infected cells, which may also cause damages to infected tissues by hyperinflammatory90 (Extended Data Fig. 4d). All these results above demonstrated that hUSI enables to recognize senescent cells that abnormally accumulated in pathological tissue and reveal associated mechanisms.
hUSI identifies immune associated senescent tumor cells in melanoma
We then sought to apply it on tumor samples, as CS plays an important role in tumor development and can activate immune responses91. Significant progress has been made in immunotherapy of melanoma, especially with the application of immune checkpoint inhibitors, such as PD-1 antibodies and CTLA-4 antibodies, which result in significant durable responses and therapeutic efficacy92, 93. However, the mechanisms underlying immunotherapy remain incompletely understood. Several studies have demonstrated the relationship between senescent tumor cells and immune recognition94–97, thus we sought to identify senescent tumor cells and investigate whether it could serve as potential targets for immunotherapy in melanoma.
To explore the characteristics of senescent tumor cells in melanoma, we evaluated the senescence degree of tumor cells by applying hUSI on a melanoma scRNA-seq data set98. We then used GMM to infer three cell subpopulations which were denoted as cycling, transactional and senescent, based on the significantly increasing hUSI level (Extended Data Fig. 5a, b). The senescence degree of these subpopulations was further validated by a microarray-based transcriptome dataset of melanocytes52. By overlapping DEGs of melanocytes bulk samples with specific highly expressing genes in our defined cell subpopulations (Methods), we found that genes up-regulated in senescent melanocytes were significantly enriched in senescent and transitional subpopulations (Fig. 4a and Supplementary table S5). On the contrary, genes up-regulated in growing melanocytes were significantly enriched only in cycling subpopulation (Fig. 4a and Supplementary table S5). We also validated the different senescence degree of these three subpopulations by inferring a senescence trajectory of tumor cells. We imputed 38 co-expression modules based on hUSI-related genes and diffusion map was used for dimensionality reduction and visualization. The senescence trajectory was characterized by the transition of tumor cells from cycling to senescent status (Fig. 4b and Extended Data Fig..5c, d and Methods). Two well-known senescence hallmark genes, CDKN1A and SERPINE1, showed higher expression level in senescent subpopulation than in the other two subpopulations (cycling and transitional) (Fig. 4c). Moreover, GSEA results of specific highly expressing genes in each subpopulation indicated more frequent immune activities occurred in senescent tumor cells (Fig. 4d and Methods). These results demonstrated that heterogeneity in senescence existed among melanoma tumor cells, and hUSI can reliably distinguish senescent tumor cells.
We next analyzed the impact of senescent tumor cells on melanoma patient survival. We took three tumor cell subpopulations as a reference expression profile and deconvoluted RNA-seq profiles of melanoma cohort from TCGA-SKCM using EpiDISH99, obtaining the proportion of each subpopulation in each melanoma patient (Methods). Considering the potential relationship between senescent tumor cells and immune response, we also calculated abundances of 22 immune component using CIBERSORT100. We found that the proportion of senescent subpopulation have a higher positive correlation with the abundance of M1 macrophage cells, CD8 T cells, and activated immune cells (including activated CD5 memory T cells and activated dendritic cells) (Fig. 4e). Survival analysis was then performed based on the inferred proportions of these subpopulations. The result showed that the higher proportion of senescent or transactional subpopulations in a patient, the more favorable it was for the patient’s survival, and the significance of senescent is higher than transactional (Fig. 4f). In contrast, patients with higher proportion of cycling subpopulation have worse prognosis (Fig. 4f and Extended Data Fig. 5e). These results suggested that hUSI-aided senescence state evaluation of tumor cells can serve as a promising prognostic biomarker for melanoma patients.
hUSI recognizes special signaling pathways in senescent melanoma cells
In the above analysis, hUSI helps identify senescent tumor cells in melanoma. However, the role of senescent cells in tumor microenvironment is very complex and highly dependent on the physiological environment101–103. Senescent cells can communicate with neighbor cells and influence their behavior through paracrine signaling. Specifically, SASP presented by senescent tumor cells plays important roles in communication with immune system by attracting immune cells (such as T cells and NK cells) and then leading to the clearance of senescent tumor cells95–97. Besides, CS associated communication had been speculated to regulate immune surveillance and influence tumorigenesis104. Therefore, understanding how senescent cells interact with the microenvironment may provide additional clues for the relationship between senescence and tumorigenesis.
To explore the cross-talk between senescent tumor cells and the microenvironment in melanoma, we investigated the cell-cell communication between these three tumor cell subpopulations (cycling, transactional and senescent) and their neighboring cells using CellChat105 (Fig. 5a). The results showed that the communication strength of hUSI-identified senescent subpopulation was higher than the other two relatively less senescent tumor cell subpopulations (cycling and transactional), indicating stronger cell-cell communication between senescent tumor cells and neighboring cells (Fig. 5b and Extended Data Fig. 5f). Furthermore, analysis of the global output communication patterns uncovered two different signaling patterns, with pattern 1 corresponding to the senescent tumor subpopulation and pattern 6 corresponding to the cycling and transitional subpopulations (Extended Data Fig. 5g). To analyze which pathways were responsible for senescent tumor cells to receive communication signals from tumor microenvironment, we compared communication strength of each involved signaling pathway (Methods). Six pathways (including transforming growth factor β (TGFβ) pathway, leptin (LEP) pathway, chondroitin sulfate proteoglycan 4 (CSPG4) pathway, chemokine signaling pathways (CCL), CD6 pathway and bone morphogenetic protein (BMP) pathway) were found to have input signal strength to senescent subpopulation and not detected in cycling subpopulation, indicating that these signaling pathways are more likely to specifically function in senescent tumor cells (Fig. 5c).
In the two major pathways, TGF-β can induce senescent phenotype of tumor cells, which is secreted by macrophages originating in tumor stroma106, 107, and BMP is a family of TGF-β superfamily, which has similarly been found to induce senescence of tumor cells108, 109. Through signaling pathways pathway network, we found that senescent tumor cells receive TGF-β from macrophages (Fig. 5d), which is consistent with previous report in lymphoma107. Interestingly, senescent tumor cells received more TGF-β from cancer-associated fibroblasts (CAFs) (Fig. 5d). This may indicate that as a solid tumor, melanoma differs from lymphoma in microenvironment by the presence of a high number of fibroblasts. Moreover, senescent tumor cells received BMP from a variety of cell types in the microenvironment, of which the signal from T cells was the strongest (Fig. 5d). Further investigation of ligand-receptor interactions in signaling pathways revealed that the expression level of genes encoding receptors for TGF-β and BMP were higher in senescent subpopulation than in cycling and transitional subpopulations, with TGFBR2 and BMPR1B showing more significant differences among these three subpopulations (Fig. 5e). In addition, survival analysis performed on TCGA-SKCM also showed that patients with higher expression level of TGFBR1, TGFBR2 and BMPR2 have better prognosis (Fig. 5f), consistent with the idea that stronger interactions by these pathways between senescent tumor cells and microenvironment could benefit patient survival.
We also noticed the other four signaling pathways which are also specific for senescent tumor cells. While transitional and senescent subpopulation interact with T cell by LEP signaling pathway and with CAFs by CSPG4 signaling pathway, the receptors involved in these two pathways did not show significant difference (Extended Data Fig. 6a). Notably, senescent subpopulation interacts with T cell by CD6 signaling pathway and with macrophages by CCL signaling pathway. CD6 receptor encoding gene ALMCAM and CCL receptor encoding gene CCR10 are specifically highly expressed in senescent subpopulation and benefit patients’ survival (Extended Data Fig. 6b, c). Although, these signaling pathways are reported associated with tumor progression110–113, their functions on tumor cell senescence need further study. Overall, these results highlight the clinical value of hUSI in identifying senescent tumor cells and the potentially involved signaling pathways.