Integrated genomic analysis identifies 15 subtypes of T-ALL
Using the uniform manifold approximation and projection (UMAP) technique11, we projected RNA-seq gene expression data in two dimensions and employed the Leiden12 algorithm for data clustering (Fig.1a, Extended Data Fig.1b-c, Supplementary Table 3-7). To identify genomic drivers of each cluster, we examined associations between subtypes and genomic alterations identified from analysis of sequence and structural DNA variants (SV). Putative drivers were identified in 95.1% of cases (1245 of 1309), 59% (777 of 1309) of which were in non-coding regions of the genome (Fig.1b-c). WGS was required to identify drivers for 29% (378) of cases, particularly for detecting inversions, translocations, and enhancer single nucleotide variants or small insertions/deletions (SNV/indels) in non-coding regions. Subtype-defining drivers were not identified in 4.9% of cases, of which 39% had low tumor purity.
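The percentages quoted above can be checked against the stated counts. The following is a minimal arithmetic sketch; the helper name `pct` is ours and is not part of any analysis pipeline described in the text.

```python
# Counts quoted in the paragraph above; the cohort comprises 1309 cases.
COHORT = 1309


def pct(count: int, total: int = COHORT, digits: int = 1) -> float:
    """Return count/total as a percentage rounded to `digits` decimals."""
    return round(100.0 * count / total, digits)


with_driver = pct(1245)             # cases with a putative driver
non_coding = pct(777, digits=0)     # cases with a non-coding driver
wgs_required = pct(378, digits=0)   # cases whose driver required WGS
no_driver = pct(COHORT - 1245)      # cases without a subtype-defining driver

print(with_driver, non_coding, wgs_required, no_driver)
```

The rounded values agree with the reported 95.1%, 59%, 29% and 4.9%.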
Previous studies have identified 9 distinct transcriptional subtypes of T-ALL3,6. Integrated analysis of WGS and RNA-seq data delineated 15 subtypes, each with specific drivers and patterns of oncogene expression (Fig.1d, Extended Data Fig.1b-g). Several of these subtypes were previously recognized, including those with deregulation of the TAL1/TAL2/LMO1/LMO2/LYL1 transcriptional regulators, or deregulation of TLX1, TLX3, NKX2-1 or HOXA9. This analysis enabled the identification of additional genomic alterations deregulating these drivers, as well as subdivision of several of these subtypes into subgroups based on genomic and clinical features. For example, deregulation of TLX3 in the TLX3 subtype is commonly due to hijacking of the BCL11B (ThymoD) enhancer13, but we identified multiple additional enhancers that were rearranged (R) and deregulated TLX3, including the T-cell receptor (TCR) β locus, CDK6, the NOTCH1-driven MYC enhancer14,15(N-Me), and SATB1 enhancers. TLX1 activation commonly arises from rearrangement to TCRβ or δ locus, but we also identified a diverse range of intergenic losses, translocations, and inversions that resulted in TLX1 deregulation in the TLX1 subtype. Similarly, NKX2-1 activation was achieved by TCRδ loci rearrangements, along with recurrent chromosome 14 chromothripsis (Supplementary Fig.1), intergenic losses, rare enhancer hijacking events, and enhancer gains, all contributing to elevated expression in the NKX2-1 subtype. The NKX2-5 subtype harbored rearrangements of NKX2-5 to TCRβ/δ loci or BCL11B enhancer hijacking16. A subtype of cases with HOXA9 deregulation was characterized by HOXA9 hijacking the TCRβ locus enhancers17. SPI1 fusions with TCF7/STMN118, and YWHAE were hallmarks of the SPI1 subtype. The BCL11B subtype harbored rearrangements juxtaposing hematopoietic stem cell (HSC) enhancers to BCL11B, previously identified in lineage ambiguous leukemia and ETP-ALL19,20. 
STAG2/LMO2 T-ALL most commonly harbors an LMO2::STAG2 rearrangement21 that activates LMO2 and inactivates the cohesin gene STAG2.
Cases with deregulation of the TAL1/TAL2/LMO1/LMO2/LYL1 core transcriptional circuitry could be divided into two subtypes previously termed TAL1-RA and -RB17. Considering our tumor and normal progenitor genomic data, we term these TAL1 αβ-like and TAL1 double positive (DP)-like subtypes. The TAL1 DP-like subtype exhibits higher expression of RAG1/2, CD4 and CD8, whereas the TAL1 αβ-like subtype is characterized by higher expression of the TCR alpha constant (TRAC) gene and TCRα/β rearrangements (Fig.1d, Extended Data Fig.1f). Known drivers in these subtypes are STIL::TAL1, rearrangements of TAL1, LMO1, LMO2 and LYL1 to TCRδ/β enhancers, and non-coding sequence mutations creating a TAL1 neoenhancer5. Here we identified multiple additional mechanisms of oncogene deregulation, including copy number variation (CNV) gains and SNV/indel mutations generating neoenhancers for TAL1, LMO2, and LYL1, and intergenic inversions and deletions that result in enhancer hijacking-mediated deregulation of TAL1 and LMO2.
Two additional subtypes were defined: the ETP-like subtype, which was enriched for cases of ETP immunophenotype and had diverse genomic driver alterations, and the LMO2 gamma delta (γδ)-like subtype, which had diverse alterations, including LMO2 activation from BCL11B enhancer hijacking, FOXP1 rearrangement, rearrangement of MYC to TCRδ, and enhancer SNV/indels activating LMO2.
We analyzed the differentiation stage of each T-ALL subtype by projecting gene expression signatures onto single cell RNA (scRNA) and assay for transposase-accessible chromatin (ATAC) sequencing data of normal thymi (n=3) and BM samples (n=5; Fig.1e). The 15 T-ALL subtypes spanned a continuum of normal T-cell maturation, ranging from immature HSC or progenitor cells (HSPC), common lymphoid progenitors (CLP), lympho-myeloid primed progenitors (LMPP) or ETP for the BCL11B and ETP-like subtypes; pro/pre-T for the KMT2A, TLX3 and MLLT10 subtypes; cycling double positive (DP) for HOXA9 TCR, TLX1, NKX2-1 and TAL1 DP-like subtypes; TCRA expressing single positive αβ/mature αβ for TAL1 αβ-like; and γδ/effector T for LMO2 γδ-like that was also associated with TCRγδ rearrangements. Notably, enrichment for myeloid signatures was also observed, including dendritic cells (DC) for the SPI1 subtype, granulocyte/monocyte progenitors (GMP) for NKX2-5, and megakaryocytic/erythroid precursor (MEP) for the STAG2/LMO2 subtype.
We observed associations between subtypes and clinical features. Patients within STAG2/LMO2, NKX2-5 and SPI1 T-ALL were younger at diagnosis and the BCL11B and HOXA9 TCR subtypes included a higher proportion of older patients (Extended Data Fig.1j-k). Sex distribution varied across subtypes, with a higher female proportion in the STAG2/LMO2, ETP-like, HOXA9 TCR, LMO2 γδ-like and NKX2-5 subtypes, and a higher male proportion in the TAL1 subtypes (Extended Data Fig.1l).
Detection of significantly altered coding and non-coding alterations
We used dNdScv and GISTIC222,23, and the genomic random interval (GRIN2) model, to identify significantly recurrent coding and non-coding alterations24. Altered genes were systematically assigned to 17 distinct pathways. Across the entire cohort, we identified 164 recurrent genes (q-value<0.05) and 46 broad CNVs (Extended Data Fig.2a, Supplementary Table 8-22). Many of these recurrent alterations showed subtype specificity (Extended Data Fig.2b). Twenty-one genes had recurrent alterations outside of coding regions that were only detected by GRIN2 regulatory feature analysis. CDKN2A (71% of cases) and NOTCH1 (69%) were the most recurrently altered genes. PHF6, PTEN, LEF1, MYB, MYC, and RUNX1 each had at least 9 genomic mechanisms of deregulation, many of which required WGS for identification (Extended Data Fig.2a). We identified recurrent sequence mutations for 16 previously unreported coding genes, including putative loss of function stop/frameshift mutations for CUL1, PSIP1, NOL4L, KMT2E, KAT6A25, DDX39B and MYL1 and mutation hotspots for CD99, E2F1, LCK and RBMX.
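The q-value<0.05 threshold used above refers to a false discovery rate-adjusted p-value. Each tool reports its own q-values, and the exact procedures are not described in this section; as a generic illustration only, the sketch below applies the Benjamini-Hochberg adjustment to hypothetical per-gene p-values (the gene labels and numbers are invented for the example).

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg q-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical per-gene p-values, for illustration only.
p = {"GENE_A": 0.0001, "GENE_B": 0.004, "GENE_C": 0.04, "GENE_D": 0.2, "GENE_E": 0.9}
q = dict(zip(p, benjamini_hochberg(list(p.values()))))
significant = [gene for gene, value in q.items() if value < 0.05]
print(significant)
```

In this toy input, GENE_C has a nominal p-value below 0.05 but does not pass the adjusted threshold, which is the reason recurrence is reported by q-value rather than by raw p-value.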
Diverse oncogene-activating non-coding alterations
Enhancer-mediated oncogene activation was observed in 70.5% of cases and 43.9% of cases had enhancer SNV/Indel, translocations or intrachromosomal inversions that required WGS for identification. The underlying mechanisms were highly diverse, including translocations or inversions, as well as chromothripsis or intergenic copy number (CN) losses juxtaposing 12 enhancers to 62 oncogenes observed in 50.8% of cases, and sequence alterations resulting in the generation of putative neoenhancers for 8 genes in 34.8% of cases (Fig.2a).
We used ATAC-seq and histone 3 lysine 27 acetylation (H3K27ac) HiChIP in 19 T-ALL samples encompassing 19 different alterations in 6 subtypes with reference to healthy cord blood-derived CD34+ HSPC and thymic DP normal cell controls to examine open chromatin state, enhancer formation, and interactions of putative enhancers with oncogenes. Fifty genes, 23 of which were recurrent (q-value<0.05), hijacked TCR enhancers; of these, events deregulating HOXA13, HOXA9 and LMO3 were verified by HiChIP (Extended Data Fig.3a-c, Supplementary Table 19). Strikingly, the BCL11B enhancer was hijacked by 16 genes (6 recurrent; q-value<0.05); LMO2, HOXB13, NKX2-1 and HOXA13 were validated by HiChIP (Extended Data Fig.3d-g). Several other enhancers were hijacked by specific genes, such as intergenic 11p deletions resulting in LMO2 deregulation driven by hijacking of RAG enhancers (Fig.2b), CAPRIN1 and CELF1 enhancers (Extended Data Fig.3h), TAL1 inversion to the DHX9 enhancer (Extended Data Fig.3i), TLX1 hijacking of the LINC00592 enhancer via intergenic loss, LMO2 hijacking of the MIR2117HG enhancer (Extended Data Fig.3j), NKX2-1 hijacking of the NFKBIA enhancer (Extended Data Fig.3k), and ARID1B enhancer hijacking by BCL11B20. By contrast, some enhancers were hijacked by many different oncogenes: IGH (5 oncogenes), MIR181A1HG (4, including MIR181A1HG::HOXA13 and MIR181A1HG::LMO2, Extended Data Fig.3l,m), SATB1, N-Me14 and CDK6 (3 each; Fig.2a).
Non-coding alterations also deregulated 15 genes that were not initiating/classifying drivers in 14.6% of cases. These included MYC deregulation from SV of the N-Me enhancer14,15 (8.3%), enhancer SNV/indel mutations resulting in IL7R deregulation (2.2%), ZNF219-HNRNPC intergenic deletion (1.2%), and deletion of the RUNX1 (1.2%) and IRX3 (1.2%) regulatory regions (Supplementary Fig.2a-b). Specifically, a recurrent deletion within the FTO locus was associated with increased expression of IRX3 and IRX5, located 240 kb and 890 kb downstream, respectively, and the deleted region contains a putative regulatory region for these genes in CD34+ cells (Extended Data Fig.4a). In a second example, ZNF219, positioned 520 kb downstream of the TCRδ locus, exhibited recurrent intergenic losses between ZNF219 and the HNRNPC promoter region in conjunction with TCRδ locus deletions, resulting in elevated monoallelic expression of ZNF219 (Extended Data Fig.4b,c, Supplementary Fig.2c).
Several enhancer hijacking events were associated with developmental stage. Intergenic deletions between LMO2 and RAG2 were observed in the TAL1 DP-like subtype, and the RAG2 enhancer that drives LMO2 deregulation was highly active in normal DP and not in CD34+ cells (Fig.2b, Supplementary Fig.3a). Similarly, inversion of TAL1 to the CD1 locus resulted in hijacking of CD1E enhancers, also active in normal DP cells and thymic CD34+CD1a+ cells, but not BM CD34+ cells (Fig.2c, Supplementary Fig.3b). Conversely, a HOXA13-deregulated case in the ETP-like subtype hijacked SOX4 enhancers located within the CASC15 locus that are preferentially active in normal BM/thymic CD34+ cells as compared to DP cells (Fig.2d, Supplementary Fig.3c).
Mutational generation of recurrent neoenhancers was frequent in TAL1 DP/αβ-like ALL (24.7% of cases; TAL1, 4 regions; LMO2, 4 regions; LMO1, 1 region) and was also observed in NKX2-1 (N=2) and LYL1 (N=5). We detected recurrent TAL1 enhancer gains (median size 133 bp) 28 kb downstream of TAL1, enhancer SNV/indels 20 kb downstream of TAL1, SNV/indels in the first intron of TAL1 and the enhancer indel upstream of TAL15, all of which were active enhancers in CD34+ and not DP cells (Fig.2f). We validated TAL1 enhancer gains using ATAC-seq, HiChIP and isoform sequencing (isoseq; Fig.2g, Extended Data Fig.4d, Supplementary Fig.2d). Small intronic gains (median size 77 bp) and previously reported SNV/indels26 of LMO2 were associated with neomorphic promoter generation and non-canonical LMO2 isoform expression (Extended Data Fig.4e, Supplementary Fig.2e). Another mutation hotspot was found 1.8 kb upstream of LMO2. ATAC-seq and HiChIP revealed that a 6 bp deletion at this enhancer resulted in increased H3K27 acetylation compared to other LMO2 alterations, open chromatin, and heightened expression, consistent with the generation of a neoenhancer (Extended Data Fig.4e, Supplementary Fig.2e).
Oncogene intragenic SV and intronic SNVs
In addition to known coding sequence mutations that result in altered function or SV that impact gene expression, we observed non-coding events and intragenic SV of unknown functional consequence in 50 genes. We detected recurrent NOTCH1 intronic SNVs in 1.6% of cases that triggered alternative splicing of exon 28, which we validated using isoseq, RT-PCR and Sanger sequencing (Fig.2e, Supplementary Fig.4,5). The mutation resulted in increased NOTCH1 activation when compared to wild-type or heterodimerization domain (HD) or sequence rich in proline, glutamic acid, serine, and threonine (PEST) domain mutations, as demonstrated by luciferase assay results (Fig.2f). AlphaFold27 predicted an extension of the disordered region between the transmembrane domain and the HD domain without disruption to the HD domain (Supplementary Fig.6). This extension is likely to disrupt signaling or alter the cleavage of the extracellular domain, as described for NOTCH1 tandem duplication28 (Extended Data Fig.4f). In addition, we identified NOTCH1 intragenic losses, culminating in recurrent exon 3-27 and 16-27 deletions affecting the NOTCH1 extracellular domains, as well as alternative splicing potentially resulting in the constitutive activation of the NOTCH1 intracellular domain (Extended Data Fig.4g-h). IL7R transcription start site (TSS) loss was observed in 0.6% of cases and was associated with a loss of the mutated allele and also IL7R enhancer hijacking by PRLR29 (Extended Data Fig.4i). In contrast, CCND3 harbored TSS losses in 2% of samples, where the deleted site affected only the long isoform of the gene, but preserved expression of both the mutant and wild type short isoform, suggesting a tumor suppressive function of the long isoform (Extended Data Fig.4j).
TLX3 and NKX2-1 can be subdivided based on genomic profiles
We identified distinct gene expression clusters within the TLX3 (TLX3-immature and TLX3 DP-like) and NKX2-1 groups (NKX2-1 TCR and NKX2-1 other; Extended Data Fig.1c). While activation mechanisms were similar, TLX3-immature exhibited a higher incidence of WT1 alterations, NUP214::ABL1 fusions, 16q22.1 losses with CTCF, FLT3 internal tandem duplication (ITD) mutations, and JAK pathway alterations. In contrast, TLX3 DP-like exhibited distinct features such as 14q gains and LEF1 and MYB alterations (Extended Data Fig.5a-d, Supplementary Table 23). These groups also differed in TCR rearrangements and ETP-status confirmed by scRNA signature analysis (Extended Data Fig.5e-g).
NKX2-1 TCR exhibited TCR hijacking events and RPL10 mutations, while NKX2-1 other was characterized by chromothripsis leading to BCL11B enhancer hijacking, NFKBIA enhancer hijacking, and NKX2-1 locus CNV alterations, accompanied by MYB TCR rearrangements and gains (Extended Data Fig.5h-j and Extended Data Fig.3f,k, Supplementary Table 24).
Refined classification of TAL1/TAL2/LYL1 and LMO1/LMO2 deregulated T-ALL
We performed comparative analysis of driver mechanisms and co-lesions in TAL1 subtypes. TAL1 αβ-like exhibited a higher frequency of STIL::TAL1 fusions and LMO1 enhancer SNVs, while TAL1 DP-like had a higher frequency of LMO2 TCR rearrangements and RAG2 enhancer hijacking events (Fig.3a, Extended Data Fig.6a, Supplementary Table 25). TAL1 deregulation showed significant co-occurrence with LMO2 or LMO1 activation, whereas LMO2 and LMO1 alterations were mutually exclusive. Similarly, alterations involving TAL1, TAL2, and LYL1 exhibited mutual exclusivity (Extended Data Fig.6b). Although driver genes were often shared between these two subtypes, differences in hijacked enhancers and oncogene activation mechanisms suggest that the maturational arrest state is a major determinant of the phenotype. Flow cytometry-based immunophenotype analysis confirmed distinct immunophenotypes of these subtypes (Fig.3c-d, Extended Data Fig.6c-f, Supplementary Table 7).
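Co-occurrence and mutual exclusivity between two alterations are commonly assessed on a 2x2 contingency table. The statistical test used for these comparisons is not stated in this section; the sketch below assumes a two-sided Fisher exact test, and the counts in the example are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a = both alterations present, b = first only, c = second only, d = neither.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric probability of each possible top-left cell, margins fixed.
    probs = {x: comb(row1, x) * comb(row2, col1 - x) / total for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Sum all tables that are at least as unlikely as the observed one.
    return min(1.0, sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9)))


def direction(a: int, b: int, c: int, d: int) -> str:
    """Compare the observed overlap with the overlap expected under independence."""
    expected = (a + b) * (a + c) / (a + b + c + d)
    if a > expected:
        return "co-occurrence"
    if a < expected:
        return "mutual exclusivity"
    return "independent"


# Hypothetical counts: 0 cases with both alterations, 10 with each alone, 10 with neither.
p_value = fisher_exact_two_sided(0, 10, 10, 10)
print(direction(0, 10, 10, 10), round(p_value, 4))
```

A small p-value together with an observed overlap below expectation points to mutual exclusivity; an overlap above expectation points to co-occurrence. In a cohort-wide screen over many gene pairs, the resulting p-values would additionally need multiple-testing correction.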
We detected co-occurring and mutually exclusive genetic co-lesions within TAL1 subtypes, facilitating a refined classification into genetic subtypes (Fig.3b,e, Extended Data Fig.6g-i). TAL1 DP-like was classified into subgroups characterized by: RPL10 mutations, frequent DDX3X and MYB alterations; JAK alteration with frequent IL7R and STAT5B mutations; LEF1 SV/Del or LYL1-altered genetic subgroups; and, a diverse "Other" subgroup featuring an increased frequency of FBXW7, CCND3, TAL2 alterations, and TCRD::MYC. TAL1 αβ-like was subdivided into NOTCH1 wild type with frequent PTEN deletions and PI3K pathway alterations; a group marked by 6q loss; and, an "Other" category with NOTCH1 mutations but lacking 6q loss.
Gene expression profiling unveiled associations between genetic subtypes and oncogenes. For LEF1/LYL1, TAL2 altered, LMO2 γδ-like, or STAG2/LMO2 subtypes, TAL1 expression was not distinctive; however, TAL1 αβ-like demonstrated nearly exclusively high TAL1 expression (Fig.3e). Similarly, MYC expression was frequently elevated in TAL1 DP-like, possibly linked to a higher incidence of FBXW7 alterations within this group, which stabilize MYC30. In contrast, TAL1 αβ-like displayed heightened MYB and MYCN expression, whereas LMO2 γδ-like and STAG2/LMO2 exclusively expressed MYCN and MYC, respectively. MYCN mutations were enriched in LMO2 γδ-like and TAL1 αβ-like, but not within the TAL1 DP-like subtype.
Characterization of early T-cell precursor-like ALL
One of the few diagnostic features used to stratify risk in T-ALL is ETP immunophenotype (cytoplasmic CD3+, CD1a-, CD8-, CD5 dim/-, with expression of stem cell or myeloid antigens)7. “Near-ETP” T-ALL cases have a similar immunophenotype except for positivity for CD5. Prior genomic studies of ETP ALL have identified recurrent alterations of genes encoding regulators of hematopoietic development, kinase signaling and chromatin modification6,8, but have failed to identify unifying genomic alterations that distinguish such cases. Here we identify four subtypes that exhibit enrichment of ETP and near-ETP ALL. The BCL11B-activated subtype was exclusively of ETP immunophenotype19,20 and harbored 13.6% of ETP cases in the cohort. Most strikingly, we observe a group of cases that includes 70.9% of ETP and 41% of near-ETP cases in the cohort but, conversely, comprises comparable proportions of each immunophenotypic group: 38.2% of cases were ETP, 33.8% near-ETP and 27.9% non-ETP. This “ETP-like” subtype had multiple recurrent driver alterations of genes with known or putative roles in HSC development: activating rearrangements of HOXA13 (18.7%) to TCR, BCL11B, MIR181A1HG, SATB1 and CDK6 enhancers; cases with HOXA9/10/11 deregulation driven by rearrangements of MLLT10 (18.3%), KMT2A (11%), NUP214 (5.1%) or NUP98 (3.4%); loss-of-function mutations of MED12 (14.4%); and ZFP36L2 (8.9%) rearrangements or alterations of ETV6 (7.2%) (Fig.4a). Notably, KMT2A and MLLT10 rearrangements also define distinct subtypes of T-ALL not enriched for ETP ALL, supporting the notion that both cell of origin and oncogenic driver determine gene expression signatures. KMT2A fusions within the ETP-like subtype harbored mostly KMT2A::AFDN and other fusion partners, whereas the non-ETP KMT2A subtype exclusively had KMT2A::MLLT1 fusions (Extended Data Fig.7a). Similarly, a subset of NUP98- and NUP214-rearranged cases clustered apart from ETP-like NUP98/NUP214-R fusion cases.
Each ETP-like driver had distinct patterns of concomitant alterations: 2q alterations in the ETV6 subgroup; RUNX1, JAK, SUZ12, ASXL1 mutations in the ZFP36L2 subgroup (Extended Data Fig.7b, Supplementary Table 26); ETV6, TP53, SATB1, SH2B3 alterations in the HOXA13 subgroup; PSIP1 mutations in cases with MLLT10 fusions; KAT6A and MBNL1 mutations in the KMT2A subgroup, and gains of chromosomes 8, 10 and 19, loss of 5q and mutation of IKZF1 and GATA3 in the MED12 subgroup. ETP-like cases with MLLT10/KMT2A/NUP98/NUP214 driver alterations had a higher frequency of CNVs and alterations of ETV6, GATA3, IKZF1 and RAS signaling, and fewer NOTCH pathway alterations than non-ETP-like cases with these drivers, indicating likely roles of cell of origin, fusion partner and co-lesion in driving gene expression fate. By contrast, near-ETP cases were more dispersed, and enriched in the ETP-like, TLX3 Immature and TAL1 αβ-like subtypes (Extended Data Fig.1h-i).
The MED12 alterations observed in ETP-like cases were observed across the coding region of MED12, suggesting loss of function (Supplementary Fig.7). To test this, we inactivated MED12 using genome editing in the LOUCY (SET::NUP214) cell line (Supplementary Fig.8) that has immunophenotypic similarity to ETP ALL, and observed upregulation of histone deacetylase pathway gene expression (Extended Data Fig.7c). Intersection of these data with the gene expression profile of MED12 ETP-like cases showed common reduced expression of the T-cell differentiation markers CD5 and CD28, and increased expression of the stem cell markers LMO2 and HHEX (Fig.4b, Extended Data Fig.7d-f, Supplementary Table 27-29), indicating loss of MED12 function directly contributes to the immaturity characteristic of ETP-like ALL.
Integrated genomic analysis also elucidated mechanisms driving differential deregulation of specific HOXA genes in ETP-like ALL. Specifically, enhancer hijacking alterations driving HOXA9, but not HOXA13 deregulation such as TCRB::HOXA9 showed that rearrangement breakpoints were always located between two CTCF peaks that demarcate a topologically associating domain (TAD) boundary between the HOXA9 and HOXA13 loci in CD34+ HSPC cells (Fig.4c, Extended Data Fig.7g). By contrast, all breakpoints of rearrangements deregulating HOXA13 were confined to the HOXA13 TAD, thus constraining activation of HOXA9.
Although 27.9% of cases in the ETP-like subtype did not fulfil the immunophenotypic criteria for ETP/near-ETP ALL, they exhibited immunophenotypic trends (lower expression of T-cell markers and expression of myeloid/stem cell markers) and commonalities including absence of TCR rearrangements, similar maturational stage, genomic drivers, and outcome (Fig.4d-i, Extended Data Fig.7h-j, Supplementary Results). Thus, ETP-like ALL is a subtype of ALL with distinct, heterogeneous drivers, a likely HSPC origin but variable diagnostic immunophenotype; genomic classification should replace immunophenotypic classification.
Outcome analysis reveals genomic risk factors associated with refractory disease, relapse and secondary malignancies
We examined genomic features associated with clinical outcome (Supplementary Table 30-39). Positivity for residual disease (RD) (MRD ≥0.01%9) was particularly common in the ETP-like and LMO2 γδ-like subtypes (Fig.5a, Extended Data Fig.8a). We tested associations of subtypes, genetic drivers, co-lesions, broad CNV changes and altered pathways with RD risk (Extended Data Fig.8b). Notably, ETP-like drivers and co-lesions (such as SH2B3, ETV6, NRAS, WT1) were associated with higher RD risk, while TAL1 subtype-related features (LEF1, USP7, PI3K, CCND3) were associated with lower RD risk. Additionally, pathways such as JAK and RAS were associated with higher RD risk, whereas NOTCH, ribosome, and PI3K were associated with lower RD risk.
Examining event-free (EFS), disease-free (DFS) and overall survival (OS), the SPI1 and LMO2 γδ-like subtypes had dismal outcomes, and the NKX2-5, as well as ETP-like KMT2A, MLLT10, HOXA13 genetic subtypes had adverse outcome. The non-ETP-like KMT2A subtype had higher MRD but a very favorable prognosis, unlike KMT2A cases within the ETP-like subtype (Fig.5b). Similarly, non-ETP-like MLLT10 cases had better prognosis compared to ETP-like MLLT10 cases. Analogous patterns emerged for TLX3, where TLX3 DP-like had favorable prognosis and TLX3 Immature exhibited worse outcomes. The ZFP36L2 subgroup had increased rates of high MRD, yet a favorable outcome, highlighting that early poor disease response alone should not be the sole factor driving decisions such as HSC transplant. Notably, heterogeneity was also evident within TAL1 genetic subgroups, as TAL1 DP-like subgroups (‘LEF1/LYL1’ and ‘Other’) were associated with inferior EFS and DFS, and TAL1 αβ-like subgroups (‘Notch wt’ and ‘Other’) were associated with inferior OS, whereas the ‘RPL10’ subgroup had an excellent outcome (Extended Data Fig.8c).
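Survival endpoints such as EFS, DFS and OS are typically summarized with the Kaplan-Meier estimator, which handles patients who are censored before an event is observed. The estimator used in this study is described in its Methods rather than here, so the following is only a minimal sketch of the standard product-limit calculation on hypothetical follow-up data.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return [(event_time, survival_probability)] for right-censored data.

    events[i] is 1 if the event was observed at times[i] and 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for tt, e in data if tt == t]  # all records at this time
        n_events = sum(tied)
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= len(tied)  # events and censored cases both leave the risk set
        i += len(tied)
    return curve


def survival_at(curve: List[Tuple[float, float]], t: float) -> float:
    """Survival probability at time t (step function, 1.0 before the first event)."""
    s = 1.0
    for time, surv in curve:
        if time <= t:
            s = surv
    return s


# Hypothetical follow-up in years; 1 = event, 0 = censored.
years = [1, 2, 3, 4, 5, 6]
status = [1, 0, 1, 0, 0, 0]
five_year = survival_at(kaplan_meier(years, status), 5)
print(five_year)
```

In this toy cohort, two events among six patients give a five-year estimate of 0.625 rather than 4/6, because the patient censored at year 2 no longer contributes to the risk set at the second event.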
Next, we examined associations between genetic variants and outcome (Fig.5c, full variant list in Extended Data Fig.9a). Most NOTCH1 variants, traditionally perceived as markers of a favorable prognosis regardless of MRD response, were associated with favorable outcome (Extended Data Fig.9b). Unexpectedly, NOTCH1 intronic SNV and NOTCH1 intragenic losses were associated with worse OS and EFS, respectively (Fig.5c). Further, MYC TCR rearrangements had inferior DFS, whereas MYC enhancer gains had favorable DFS. PTEN alterations emerged as another poor prognosis feature, as cases with PTEN deletions had markedly worse outcomes compared to other PI3K pathway alterations. Within TAL1 subtype features, LYL1 TCR and LMO2 intergenic losses leading to RAG/CAPRIN1 enhancer hijacking were associated with worse outcomes, in contrast to favorable prognostic markers such as 6q loss and RPL10 mutations. Collectively, these results demonstrate that risk stratification must account for the type of variant for a given gene and not only the gene that is altered.
Through competing risk (CR) models, we identified risk factors such as LMO2 intergenic loss, MYC TCR, PTEN deletions and NOTCH1 intragenic deletions (Extended Data Fig.9c-g) associated with relapse. We found an association of TAL1 upstream indel with a higher relapse risk compared to other TAL1 mechanisms and overall differential relapse risks across TAL1 subtypes (Extended Data Fig.9h-i). In a recent study, TAL1 upstream Indels and TCR::LMO1 were associated with induction failure; however, we found no association in our cohort (Extended Data Fig.9j).
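In a competing risk setting, the probability of relapse is estimated with a cumulative incidence function rather than with one minus the Kaplan-Meier estimate, because other events (for example death in remission) remove patients from the risk set. The specific CR model used is not detailed in this section; the sketch below shows the standard nonparametric (Aalen-Johansen) estimator on hypothetical data, with event codes chosen for illustration.

```python
from typing import List, Tuple


def cumulative_incidence(times: List[float], causes: List[int],
                         cause_of_interest: int) -> List[Tuple[float, float]]:
    """Nonparametric cumulative incidence for one cause among competing events.

    causes[i] is 0 for a censored observation, otherwise an integer event code.
    """
    data = sorted(zip(times, causes))
    at_risk = len(data)
    event_free = 1.0  # probability of no event of any type just before time t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [c for tt, c in data if tt == t]
        d_interest = sum(1 for c in tied if c == cause_of_interest)
        d_any = sum(1 for c in tied if c != 0)
        if d_interest:
            cif += event_free * d_interest / at_risk
            curve.append((t, cif))
        event_free *= 1 - d_any / at_risk
        at_risk -= len(tied)
        i += len(tied)
    return curve


# Hypothetical codes: 1 = relapse, 2 = competing event, 0 = censored.
follow_up = [1, 2, 3, 4]
codes = [1, 2, 0, 1]
relapse = cumulative_incidence(follow_up, codes, cause_of_interest=1)
competing = cumulative_incidence(follow_up, codes, cause_of_interest=2)
print(relapse[-1][1], competing[-1][1])
```

Here the cumulative incidences of the two causes sum to 1 at the last event time, a property that one minus Kaplan-Meier computed separately per cause does not guarantee.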
Notably, CR analysis of secondary malignancies showed that 4 out of 11 SPI1 fusion cases developed histiocytosis and myeloid sarcoma within a year of diagnosis (Fig.5d). The T-ALL samples exhibited elevated expression of markers also expressed by dendritic cells (HLA-D, CD1a, CD38, CD45, CD7, CD5, and sCD3) and the SPI1 signature showed high enrichment in thymic dendritic cells and gene expression markers aligned with immunophenotype, suggesting the cell of origin that acquires SPI1 fusion has T and dendritic cell characteristics (Fig.5e-f).
Multivariable genomic models accurately predict patients at risk in T-ALL
We developed multivariable models incorporating clinical variables, treatment response and genetic subtypes and alterations to predict outcome and risk stratify patients (Methods). Random Survival Forest (RSF) and Penalized Cox Regression (pCox) had the highest accuracy when each model was fitted using numeric MRD, clinical variables (sex, WBC count at diagnosis >2×10⁵ cells/μl and central nervous system (CNS) status), subtype/variant level genomic features for the pCox model, and genetic subtype for the RSF model (Extended Data Fig.10a, Supplementary Results, Supplementary Table 40). These approaches are distinguished by their precision and potential clinical utility (Supplementary Results). The pCox model was designed to comprehensively identify combinations of genetic and clinical features that were independently prognostic. The second, four-node Survival Tree (ST) model was designed with a specific focus on the practical application of the RSF model in a clinical setting. In this model, patients were stratified into prognostic groups using only their genetic subtypes and MRD status.
The pCox model achieved a concordance of 0.767 and incorporated three clinical features, five subtypes and 18 genomic alterations to stratify patients into four equally sized groups with five year EFS ranging from 65 to 97% (Fig.6a,b, Supplementary Table 41). SPI1 subtype, ETP-like subtypes, PTEN deletions/loss, PIK3CD SNV/indels, and LMO2 intergenic loss associated with worse outcomes, while KMT2A subtype, 6q loss, RPL10 and NOTCH1 SNV/indels were associated with favorable outcomes both in univariable and pCox models, highlighting their value as independent prognostic biomarkers (Fig.6a, Fig.5b-c).
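The concordance reported for these models measures how often, among comparable pairs of patients, the patient with the higher predicted risk experiences the event first (0.5 is chance, 1.0 is perfect ranking). How ties and censoring were handled in this study is not stated here; the sketch below implements the common Harrell formulation on hypothetical data and counts tied risk scores as half-concordant.

```python
from typing import List


def concordance_index(times: List[float], events: List[int],
                      risk_scores: List[float]) -> float:
    """Harrell's C: fraction of comparable pairs ranked correctly by risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i has an observed event before j's time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times, event indicators (1 = event) and model risk scores.
t = [2, 4, 6, 8]
e = [1, 1, 0, 1]
risk = [0.9, 0.3, 0.5, 0.1]
c_index = concordance_index(t, e, risk)
print(c_index)
```

In this example five pairs are comparable and four are ranked correctly, giving a concordance of 0.8; the censored patient contributes only as the later member of a pair.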
The four node ST model achieved a concordance of 0.712 and was able to risk stratify patients into eight groups with 5-year EFS ranging from 45-98% (Fig.6c, Extended Data Fig.10b). Several features, including the ETP-like drivers KMT2A, MLLT10, NUP98 and rare drivers, and the SPI1, LMO2 γδ-like and NKX2-5 subgroups had poor outcome (5-year EFS <60%) regardless of MRD response. These patients should be considered for HSC transplant or novel immunotherapies as outcomes are poor despite intensive multi-agent chemotherapy. In contrast, several other features, including ETP-like with ZFP36L2 alterations, TLX3 DP-like, TAL1 DP-like RPL10, NKX2-1, TLX1, KMT2A, HOXA9 TCR, TAL1 αβ-like Loss 6q, TAL1 αβ-like Notch wt had very favorable outcomes (5-year EFS >98%) if day 29 MRD was <0.01%. This large group of patients (n = 260; ~20% of cohort) may benefit from a reduction in intensity of chemotherapy.
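The two extremes of this stratification can be written as a simple lookup. The sketch below encodes only the groupings named in the paragraph above; it is not the fitted ST model, which defines eight groups, and the label strings, function name and the "unassigned" fallback are our own illustrative choices.

```python
# Subgroups described above as having poor outcome regardless of MRD response.
POOR_REGARDLESS_OF_MRD = {
    "ETP-like KMT2A", "ETP-like MLLT10", "ETP-like NUP98", "ETP-like rare drivers",
    "SPI1", "LMO2 γδ-like", "NKX2-5",
}

# Subgroups described above as very favorable when day 29 MRD is <0.01%.
FAVORABLE_IF_MRD_NEGATIVE = {
    "ETP-like ZFP36L2", "TLX3 DP-like", "TAL1 DP-like RPL10", "NKX2-1", "TLX1",
    "KMT2A", "HOXA9 TCR", "TAL1 αβ-like Loss 6q", "TAL1 αβ-like Notch wt",
}

MRD_THRESHOLD_PERCENT = 0.01


def stratify(subgroup: str, day29_mrd_percent: float) -> str:
    """Return 'poor', 'very favorable' or 'unassigned' for a subgroup and MRD value."""
    if subgroup in POOR_REGARDLESS_OF_MRD:
        return "poor"
    if subgroup in FAVORABLE_IF_MRD_NEGATIVE and day29_mrd_percent < MRD_THRESHOLD_PERCENT:
        return "very favorable"
    # Intermediate groups of the tree are not described in the text.
    return "unassigned"


print(stratify("SPI1", 0.0), stratify("TLX1", 0.005), stratify("TLX1", 0.5))
```

The lookup makes the asymmetry explicit: MRD negativity does not rescue the poor-risk subgroups, whereas the favorable subgroups are favorable only when MRD is below the threshold.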
ETP immunophenotype (IP) was not prognostic in the ETP-like subtype (Extended Data Fig.10f). In contrast, both the pCox model and ST proved effective in accurately predicting outcomes for individuals within both the ETP-like and ETP-IP groups (Fig.6c, Extended Data Fig.10c-j). These findings underscore the necessity of employing genomics-based multivariable prognostic classification.