Nasopharyngeal Carcinoma (NPC) is a malignant epithelial cancer that occurs worldwide. According to GLOBOCAN 2020, compiled by the International Agency for Research on Cancer (IARC), there were approximately 133,354 new NPC cases and 80,008 NPC-related deaths worldwide in 2020 [1]. NPC mainly arises from the nasopharyngeal epithelium, especially the fossa of Rosenmüller [2], and can be pathologically divided into keratinizing differentiated tumor, non-keratinizing differentiated tumor, and non-keratinizing undifferentiated tumor. Because of its unique anatomical structure and radiosensitivity [3, 4], the primary therapeutic regimen for NPC is radiotherapy (RT), with or without chemotherapy, targeted therapy, and immunotherapy. Survival prediction is a major concern in clinical cancer research, including NPC research, because it provides the early prognostic information needed to guide the therapeutic regimen. In clinical practice, the Tumor, Node, Metastasis (TNM) stage, defined by the American Joint Committee on Cancer (AJCC)/Union for International Cancer Control (UICC) staging system, is widely used as an indicator for survival prediction. However, patients classified into the same TNM stage may have widely differing prognoses, and 5-year survival rates in advanced NPC range from 10% to 40% [5–7]. This may be because TNM stage takes into account only anatomical information, e.g., size, number, border, and location. Therefore, TNM stage provides only limited prognostic benefit for patients with advanced NPC.
In addition to TNM stage, many clinical biomarkers, such as age, serum lactate dehydrogenase (LDH), and body mass index (BMI), have been reported as individual prognostic indicators for survival prediction in advanced NPC [8–10]. However, these indicators are not specific to the disease and can be influenced by other factors, which limits their repeatability and practicability [11, 12]. Non-invasive image-derived biomarkers have also shown good prognostic performance for survival prediction in advanced NPC [13–15]. However, conventional imaging modalities, such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI), provide only the tumor's anatomical information. Multi-modality Positron Emission Tomography/Computed Tomography (PET/CT) imaging provides both the tumor's anatomical (from CT) and metabolic (from PET) information. Nevertheless, conventional indicators derived from PET/CT, including Standardized Uptake Value (SUV), Metabolic Tumor Volume (MTV), and Total Lesion Glycolysis (TLG), fail to represent intra-tumor information such as tumor texture, intensity, heterogeneity, and morphology [13, 14, 16, 17]. Therefore, prognostic indicators that better represent tumor characteristics, especially intra-tumor information, are needed for more accurate survival prediction.
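To illustrate why such summary indicators cannot capture intra-tumor heterogeneity, the following minimal sketch computes SUVmax, MTV, and TLG (taken here as SUVmean × MTV) for two toy lesions. The function name `pet_indicators`, the fixed SUV threshold of 2.5, the voxel volume, and the toy voxel grids are assumptions made for illustration only; they are not taken from this study, and segmentation thresholds vary between studies.

```python
def pet_indicators(suv_voxels, voxel_volume_ml, threshold=2.5):
    """Return (SUVmax, MTV in mL, TLG) for one lesion.

    suv_voxels: flat, non-empty list of SUV values inside the lesion mask.
    threshold:  fixed SUV cut-off for "metabolically active" voxels
                (an assumed value; other studies use other thresholds).
    """
    active = [v for v in suv_voxels if v >= threshold]
    if not active:
        # No voxel exceeds the threshold: zero metabolic volume.
        return max(suv_voxels), 0.0, 0.0
    suv_max = max(active)
    mtv = len(active) * voxel_volume_ml   # metabolic tumor volume
    suv_mean = sum(active) / len(active)
    tlg = suv_mean * mtv                  # total lesion glycolysis
    return suv_max, mtv, tlg


def flatten(grid):
    """Flatten a 2D voxel grid into a list of SUV values."""
    return [v for row in grid for v in row]


# Two hypothetical lesions with the same voxel values but a different
# spatial arrangement, i.e. a different texture.
clustered = [[6.0, 6.0],
             [3.0, 3.0]]
checkerboard = [[6.0, 3.0],
                [3.0, 6.0]]

a = pet_indicators(flatten(clustered), voxel_volume_ml=0.5)
b = pet_indicators(flatten(checkerboard), voxel_volume_ml=0.5)
print(a)
print(b)
```

Both lesions yield identical SUVmax, MTV, and TLG because these indicators depend only on which voxel values are present, not on where they are located, so differences in texture are invisible to them.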
Radiomics, a widely recognized computational method for prognosis, exploits quantitative features (indicators) extracted from medical images to represent tumor characteristics [18]. It has drawn much attention among clinical oncologists because it can provide comprehensive representations of tumor characteristics, including intra-tumor information [19]. Conventional Radiomics (CR) refers to the extraction and analysis of high-dimensional handcrafted features from medical images. Through high-throughput feature extraction and statistical machine-learning methods, CR can extract and analyze tumor characteristics and has been widely used in many clinical applications [19]. In an early study, Zhang et al. [20] performed CR-based prediction of local failure and distant failure in advanced NPC from MRI images. They experimented with 54 cross-combinations derived from 6 feature selection methods and 9 classification methods, and identified the optimal combinations in terms of Area Under the receiver-operating characteristic Curve (AUC) and testing errors. In a later study, Du et al. [21] used a similar analytic scheme to identify the optimal CR methods for differentiating local recurrence from inflammation in NPC using PET/CT images. CR has also been studied in other cancers, such as head and neck cancer [22] and lung cancer [23]. These studies demonstrated the capabilities of CR for prognosis and identified the optimal CR methods for their clinical targets through comprehensive comparisons. However, CR depends heavily on human prior knowledge, such as handcrafted feature extraction and manual tuning of many model parameters. Its limitations, namely that it introduces human bias and cannot capture high-level semantic information, have been well recognized [24, 25].
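The cross-combination scheme described above (every feature selection method paired with every classification method, ranked by AUC) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of Zhang et al. [20]: the two toy selectors, the two toy classifiers, and the toy data are hypothetical stand-ins for the 6 selection and 9 classification methods, and AUC is computed with the rank-based (Mann-Whitney) identity.

```python
from itertools import product


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical selectors: (X, y) -> list of kept column indices.
def select_all(X, y):
    return list(range(len(X[0])))


def select_top_mean_gap(X, y, k=1):
    """Keep the k features whose class means differ most."""
    def gap(j):
        pos = [row[j] for row, t in zip(X, y) if t == 1]
        neg = [row[j] for row, t in zip(X, y) if t == 0]
        return abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    return sorted(range(len(X[0])), key=gap, reverse=True)[:k]


# Hypothetical classifiers: (X, y) -> function mapping a row to a score.
def fit_feature_sum(X, y):
    return lambda row: sum(row)


def fit_centroid(X, y):
    """Score rises as a row moves toward the positive-class centroid."""
    def centroid(label):
        rows = [r for r, t in zip(X, y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]
    c_pos, c_neg = centroid(1), centroid(0)

    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return lambda row: dist(row, c_neg) - dist(row, c_pos)


def rank_combinations(selectors, classifiers, X_tr, y_tr, X_te, y_te):
    """Evaluate every selector x classifier pair; best test AUC first."""
    results = []
    for (s_name, select), (c_name, fit) in product(
            selectors.items(), classifiers.items()):
        cols = select(X_tr, y_tr)
        tr = [[r[j] for j in cols] for r in X_tr]
        te = [[r[j] for j in cols] for r in X_te]
        score = fit(tr, y_tr)
        results.append((auc(y_te, [score(r) for r in te]), s_name, c_name))
    return sorted(results, reverse=True)


# Toy data: feature 0 is informative, feature 1 is noise.
X_train = [[0.9, 0.1], [0.8, 0.9], [0.2, 0.8], [0.1, 0.2]]
y_train = [1, 1, 0, 0]
X_test = [[0.7, 0.0], [0.6, 0.3], [0.4, 0.9], [0.3, 0.5]]
y_test = [1, 1, 0, 0]

ranking = rank_combinations(
    {"all": select_all, "top": select_top_mean_gap},
    {"sum": fit_feature_sum, "centroid": fit_centroid},
    X_train, y_train, X_test, y_test)
for test_auc, s_name, c_name in ranking:
    print(f"{s_name:>3} + {c_name:<8} AUC = {test_auc:.2f}")
```

With 6 selectors and 9 classifiers the same loop would produce the 54 combinations mentioned above. In practice the ranking would be computed on cross-validated or held-out data rather than a four-case toy set.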
Recently, deep learning, which leverages deep neural networks to learn deep representations (features) of patterns within images, has achieved great success in medical image analysis and has inspired a trend toward Deep Learning-based Radiomics (DLR) [26]. Unlike CR, which normally consists of four separate steps (Fig. 1a), DLR removes the reliance on segmented Regions of Interest (ROIs), and its feature extraction and analysis are partially or fully coupled (Fig. 1b). DLR usually adopts a deep neural network to predict patients' outcomes directly from medical images in an end-to-end manner, thereby removing the reliance on time-consuming handcrafted feature extraction and allowing relevant and robust features to be learned automatically without human intervention [24]. In other words, DLR can remove the human bias introduced by handcrafted features and can potentially discover high-level semantic features that manually defined features may overlook.
DLR has been widely used in studies of many cancers, including glioma [27], lung cancer [28], breast cancer [29], renal tumor [30], and NPC [31–34]. Peng et al. [31] conducted one of the earliest studies to introduce DLR into the prognosis of NPC. They used a pre-trained 2D Convolutional Neural Network (CNN) to extract deep features from PET and CT images separately, and then fed the deep features, together with conventional handcrafted features, into a Cox Proportional Hazards (CPH) model to establish a prognostic nomogram. Their study suggested that deep features can serve as reliable and powerful indicators for prognosis. However, their 2D CNN extracted deep features from a single tumor slice instead of the entire tumor volume, and therefore discarded important information residing in the entire tumor, e.g., volumetric information [26]. In addition, their CNN was used only to extract deep features rather than to make predictions directly. This also undermines the advantages of DLR, because end-to-end DLR models are considered more effective at extracting relevant features [34]. In later studies, Zhang et al. [33] and Jing et al. [34] implemented end-to-end DLR models to predict the metastasis and progression of NPC from MRI images. Zhang et al. [33] still relied on a 2D CNN that takes into account at most three tumor slices, while Jing et al. [34] used a 3D CNN to make predictions based on the entire tumor volume. The DLR models of both Zhang et al. [33] and Jing et al. [34] achieved higher prognostic accuracy than CR methods; however, they used only single-modality MRI images. Existing studies have shown that radiomics based on anatomical CT or MRI [22, 34, 35] has relatively lower prognostic performance than radiomics based on multi-modality PET/CT [31, 36]. These studies suggest that PET/CT images, which contain both anatomical and metabolic information, may be more promising for achieving higher prognostic performance.
However, to the best of our knowledge, no study has attempted to develop end-to-end DLR models for predicting the prognoses of NPC patients from PET/CT images. In addition, existing DLR studies [27–30, 32–34] have two further limitations: (1) they were mainly designed for a single imaging modality, such as MRI or CT, so their DLR models cannot derive complementary features from multi-modality PET/CT images; and (2) their comparisons with CR methods were limited (e.g., only a few CR methods were chosen for comparison), which undermines the reliability of their conclusions.
In this study, we investigated DLR for long-term (5-year) survival prediction in 170 patients with advanced NPC using pretreatment PET/CT images. We employed an end-to-end multi-modality DLR model to directly predict 5-year Progression-Free Survival (PFS) from pretreatment PET/CT images. Our DLR model is a 3D CNN that was purposely optimized for multi-modality PET/CT images and can simultaneously extract complementary deep features from both PET and CT images. Our DLR model can also integrate TNM stage as a high-level clinical feature, and this integration has been demonstrated to further improve prognostic performance. In the experiments, we first followed the analytic scheme proposed by Zhang et al. [20] to identify the optimal CR methods as benchmarks, and then compared the prognostic performance of our DLR model with these benchmarks. Furthermore, we established single-modality DLR models using only PET or only CT for comparison. Our results demonstrated that DLR can outperform CR in survival prediction of NPC, and that the multi-modality DLR model can achieve higher prognostic performance than the single-modality DLR models.