AI correctly identifies PCNSL molecular groups, MYD88 status, and COO
The characteristics of the PCNSL and LBCL patient cohorts are described in Figure 1, Figure S1 and Tables S1-S12. Briefly, the large majority of PCNSL patients were treated with high-dose methotrexate (HD-MTX) regimens. The Rouen LBCL cohort consisted exclusively of nodal LBCL. The TCGA-DLBCL cohort combined nodal and extranodal LBCL; this information is not available for the DLBCL-Morph dataset. All LBCL cohorts were treated with R-CHOP or R-CHOP–like immunochemotherapy.
For our main dataset, PCNSL from PSL, a total of 605 775 tiles were generated from annotations of 106 slides from 106 patients, with a mean tile count of 5715 (±76) per patient. For the diagnostic experiments, up to 100 randomly selected tiles per patient were used for training and validation, to keep computing times within a reasonable timeframe.
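As an illustration, capping the number of tiles at 100 per patient could be sketched as follows. This is a minimal sketch and not the pipeline used in the study: the function name `sample_tiles`, the fixed seed and the tile identifiers are hypothetical.

```python
import random


def sample_tiles(tiles_by_patient, max_tiles=100, seed=0):
    """Return at most `max_tiles` randomly chosen tiles for each patient."""
    rng = random.Random(seed)  # fixed seed (assumption) so the subset is reproducible
    sampled = {}
    for patient_id, tiles in sorted(tiles_by_patient.items()):
        if len(tiles) <= max_tiles:
            # Patients with few tiles keep all of them.
            sampled[patient_id] = list(tiles)
        else:
            # Sampling without replacement: no tile is selected twice.
            sampled[patient_id] = rng.sample(tiles, max_tiles)
    return sampled


# Hypothetical tile identifiers, not taken from the study data.
tiles_by_patient = {
    "patient_A": [f"A_tile_{i}" for i in range(5715)],
    "patient_B": [f"B_tile_{i}" for i in range(40)],
}
subset = sample_tiles(tiles_by_patient)
print({patient: len(tiles) for patient, tiles in subset.items()})
# {'patient_A': 100, 'patient_B': 40}
```

Sampling per patient rather than over the pooled tiles keeps patients with very large annotated areas from dominating training.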
We evaluated our deep-learning models with several metrics: the AUROC, which reflects the ability of the model to distinguish between classes; the accuracy, defined as the number of correct predictions divided by the total number of predictions made; and precision-recall curves, which summarize the trade-off between the true positive rate (recall) and the positive predictive value (precision) across probability thresholds and are especially useful when datasets are imbalanced. We also show confusion matrices to give a global view of the predictions.
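The metrics described above can be made concrete with a small binary example. The sketch below uses pure Python and toy values that are not study data. The function names are illustrative, and returning a precision of 1.0 when no sample is predicted positive is a convention chosen here.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive case outranks a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are needed to compute the AUROC")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, predictions):
    """Number of correct predictions divided by the number of predictions."""
    return sum(y == p for y, p in zip(labels, predictions)) / len(labels)


def precision_recall(labels, scores, threshold):
    """Precision (PPV) and recall (TPR) at one probability threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention (assumption)
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy, imbalanced example (3 positives, 5 negatives); not study data.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
predictions = [1 if s >= 0.5 else 0 for s in scores]

print(auroc(labels, scores))                  # 14 of 15 positive/negative pairs are ordered correctly
print(accuracy(labels, predictions))          # 6 of 8 predictions are correct
print(precision_recall(labels, scores, 0.5))  # one point of the precision-recall curve
```

Sweeping `threshold` over all observed scores gives the full precision-recall curve.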
For the classification of the PCNSL molecular clusters, accuracy was 74%. The macro-averaged AUROC across all classes was 0.88 (±0.07) (Figure 2A). The AUROC values for the individual clusters were, respectively, 0.879, 0.776, 0.921 and 0.958. Precision-recall curves are shown in Figure 2A. We used Grad-CAM to represent the attention map of the neural network and to visualize the classification results for all subgroups (Figure 2B). The distribution of samples between the training and test sets is shown in Table S10.
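Macro-averaging gives every class the same weight regardless of how many samples it contains. As a quick consistency check, the unweighted mean of the four per-cluster AUROC values reported above reproduces the overall figure. The variable names are illustrative.

```python
# Per-cluster AUROC values as reported in the text.
per_cluster_auroc = [0.879, 0.776, 0.921, 0.958]

# Macro-average: unweighted mean over classes.
macro_auroc = sum(per_cluster_auroc) / len(per_cluster_auroc)
print(round(macro_auroc, 2))  # 0.88
```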
Similarly, we used the CLAM neural network method for other classification tasks. First, we predicted MYD88 L265P status directly from H&E WSI in the subset of patients from the PSL and Barcelona cohorts for whom this status was available (Tables S10-S11). Remarkably, we obtained good classification results, with an AUROC of 0.92 [CI 0.57-0.97] (Figure 3A). Then, we used another open-source dataset, EBRAINS, which includes a cohort of brain slides from patients without brain cancer, to classify PCNSL vs control brains (Table S12). Unsurprisingly, we obtained near-perfect classification, with an AUROC of 0.98 [CI 0.75-0.99] (Figure 3B). Lastly, we performed binary survival classification using a survival threshold of 12 months, which yielded moderate performance, with an AUROC of 0.69 [CI 0.42-0.79] (Figure 3C). Further details are provided in Figure S2.
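The text does not state how the confidence intervals were computed. One common option for small test sets is a percentile bootstrap over patients, sketched below. This is an assumption, not the authors' stated procedure. The function names, the number of resamples, the seed and the toy values are all illustrative.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of correctly ordered positive/negative pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    if len(set(labels)) < 2:
        raise ValueError("both classes are needed")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # AUROC is undefined when a resample has a single class
        stats.append(auroc(y, [scores[i] for i in idx]))
    stats.sort()
    k = int(n_boot * alpha / 2)  # number of resamples cut from each tail
    return stats[k], stats[n_boot - 1 - k]


# Toy values, not study data.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
lower, upper = bootstrap_auroc_ci(labels, scores)
print(auroc(labels, scores), (lower, upper))
```

With few patients, a single misranked case shifts the AUROC substantially, so intervals of this kind tend to be wide and asymmetric around a point estimate that is close to 1.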
Spatial intra-tumor heterogeneity in PCNSL
Next, we sought to quantify the intra-tumor molecular heterogeneity in PCNSL. We used two recently published datasets comprising spatial transcriptomics (10X Visium technology) and single-cell RNA sequencing (scRNA) data. Each cluster was identified with a list of 100 genes per cluster [4]; this previously published gene list is available in Table S14. We applied the recently described Spatial Transcriptomic Analysis using Reference-Free auxiliarY deep generative modeling and Shared Histology (starfysh) [27] algorithm, which jointly models transcriptomic measurements and histology images, to infer different cell states from spatial transcriptomics. Interestingly, in tumor-enriched PCNSL samples (Figure 4A-E), we identified a high degree of intra-tumor heterogeneity, as the 4 molecular groups were present in the same sample (Figure 4F). Importantly, high intra-tumor heterogeneity and the co-presence of the different groups were found in all the analyzed samples (Figure S3). Additionally, we hypothesized that each molecular cluster of PCNSL may be considered a particular cell state and, inspired by the work of Neftel et al [28], we represented the correlation of each enrichment score with each cluster in several PCNSL samples analyzed by scRNA (Figure 4G). This confirmed our previous results with bulk RNA-seq data [4], showing that cluster 4 is the most strongly expressed throughout the different groups. Conversely, cluster 2 was the least frequent (Figure 4G). Further methodological details are provided in the Supplementary Methods.
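To give an intuition for how per-cluster gene lists can reveal several molecular groups within one sample, the sketch below scores each spot by the mean expression of each signature and reports which clusters dominate at least one spot. This is a deliberately simplified stand-in and not the starfysh model, which jointly models expression and histology. The gene names, spot values and function names are hypothetical, and the signatures are shortened to two genes each.

```python
def signature_scores(expression, signatures):
    """Mean expression of each signature's genes in one spot (missing genes count as 0)."""
    return {
        cluster: sum(expression.get(gene, 0.0) for gene in genes) / len(genes)
        for cluster, genes in signatures.items()
    }


def dominant_cluster(expression, signatures):
    """Cluster whose signature has the highest score in this spot."""
    scores = signature_scores(expression, signatures)
    return max(scores, key=scores.get)


# Hypothetical signatures (the published lists contain 100 genes per cluster).
signatures = {
    "cluster_1": ["GENE_A", "GENE_B"],
    "cluster_2": ["GENE_C", "GENE_D"],
    "cluster_3": ["GENE_E", "GENE_F"],
    "cluster_4": ["GENE_G", "GENE_H"],
}

# Hypothetical normalized expression for four spots of one sample.
spots = {
    "spot_1": {"GENE_A": 5.0, "GENE_B": 4.0, "GENE_C": 1.0, "GENE_G": 2.0},
    "spot_2": {"GENE_C": 3.0, "GENE_D": 6.0, "GENE_E": 1.0},
    "spot_3": {"GENE_E": 4.0, "GENE_F": 4.0, "GENE_H": 3.0},
    "spot_4": {"GENE_G": 7.0, "GENE_H": 5.0, "GENE_A": 2.0},
}

labels = {spot: dominant_cluster(expr, signatures) for spot, expr in spots.items()}
print(sorted(set(labels.values())))
# ['cluster_1', 'cluster_2', 'cluster_3', 'cluster_4']
```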
Radiomics features predict survival in LBCL and PCNSL
Finally, we applied a complementary approach in which a neural network automatically segmented all the cells within each patch and a broad range of radiomics features was then extracted from each cell. The resulting features were scaled and averaged per patient and used to assess different PLS Cox models. For the PSL cohort, we obtained a C-index of 0.73 (std 0.03) with clinical features only (KPS, age and gender), 0.97 (std 0.01) with radiomics features only and 0.98 (std 0.01) with all the features (clinical and radiomics) (Figure 5A). We confirmed our results in an independent PCNSL cohort from Barcelona, with C-indexes of 0.68 (std 0.06) for clinical features only (ECOG, age and gender), 0.98 (std 0.01) for radiomics features only and 0.99 (std 0.01) for all the features (Figure 5A). We then extended our results to systemic LBCL in three different datasets. For the Rouen dataset, we obtained a C-index of 0.60 (std 0.08) for clinical features (GC status and age) and a C-index of 0.96 (std 0.02) both for radiomics features only and for all features (Figure 5A). For the DLBCL-Morph dataset, we obtained a C-index of 0.64 (std 0.05) for clinical features (GC status, RIPI score and age), 0.97 (std 0.01) for radiomics features and 0.98 (std 0.01) for all features (Figure 5A). Finally, for the TCGA dataset, we obtained C-indexes of 0.55 (std 0.14) for clinical features (age and gender), 0.96 (std 0.02) for radiomics features and 0.96 (std 0.02) for all features (Figure 5A).
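For readers less familiar with the C-index, the sketch below computes Harrell's concordance index on a toy right-censored dataset. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk. This is an illustrative pure-Python version, not the evaluation code used in the study. Pairs with tied follow-up times are ignored here for simplicity, and all values are invented.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair is comparable when the patient with the shorter follow-up had an
    observed event. It is concordant when that patient also has the higher
    predicted risk. Tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, not study data: follow-up in months, event indicator (1 = death,
# 0 = censored), and a model-predicted risk score.
times = [5, 10, 12, 20, 30]
events = [1, 1, 0, 1, 0]
risk_scores = [0.9, 0.4, 0.6, 0.5, 0.1]

print(concordance_index(times, events, risk_scores))  # 6 concordant of 8 comparable pairs -> 0.75
```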
The weight of each PLS Cox component is represented according to the 4 clusters of PCNSL (Figure 5B and Figure S4A). All the PLS Cox components were enriched with a broad range of radiomics features, but wavelet-transformed radiomics features were overrepresented among the most relevant ones (Figure 5C and Figure S4B). Interestingly, all 15 PLS Cox components were statistically significant in the multivariate PLS Cox analysis, as shown in the forest plot (Figure 6A). Finally, to improve the interpretability of these PLS Cox components, we correlated them with the immune cell deconvolution obtained from bulk RNA-seq. Interestingly, component 1 was strongly positively correlated with CD4 T cell infiltration and negatively correlated with NK cells and CD8 T cells (Figure 6B). Conversely, component 2 was positively correlated with CD8 T cell infiltration, among others (Figure 6B). These results suggest that our cell-based radiomic analysis may capture the different TME network associations and their impact on PCNSL.
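The correlation between a PLS Cox component and a deconvolved immune cell fraction can be illustrated as follows. The text does not specify the correlation coefficient used, so the Pearson coefficient is an assumption here, and the per-patient values are invented solely to show the sign of the association.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-patient values (one entry per patient), not study data.
component_1 = [-1.2, -0.4, 0.1, 0.8, 1.5]       # PLS Cox component score
cd4_fraction = [0.05, 0.10, 0.12, 0.20, 0.25]   # deconvolved CD4 T cell fraction
nk_fraction = [0.30, 0.22, 0.20, 0.10, 0.08]    # deconvolved NK cell fraction

print(pearson(component_1, cd4_fraction) > 0)  # True: positive association
print(pearson(component_1, nk_fraction) < 0)   # True: negative association
```

A rank-based coefficient such as Spearman's would be a reasonable alternative when the cell fractions are strongly skewed.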