2.1. Patient cohort
We collected the clinical data of patients who were diagnosed with esophageal squamous cell carcinoma (ESCC) and received concurrent chemoradiotherapy (CCRT) in our hospital from January 2013 to December 2015. Patients were excluded if they met any of the following criteria: 1) esophagectomy with preoperative or postoperative adjuvant radiotherapy; 2) distant metastatic disease; 3) low-dose (< 50 Gy) palliative radiotherapy; 4) incomplete clinicopathological information; 5) esophageal fistula diagnosed before treatment; 6) poor image quality due to artifacts, or a tumor too small to be recognized on CT images; 7) another primary tumor; 8) death within three months after chemoradiotherapy.
After multiple iterations, a total of 221 patients were randomly divided into a training cohort (155 patients) and a validation cohort (66 patients). To improve the generalizability of the results, multi-factor stratification was used to keep the characteristics of each sub-cohort consistent with those of the whole cohort. The process of patient enrollment and randomization is shown in Fig. 1. This study was approved by the Institutional Committee on Human Rights of our hospital. Disease was staged according to the 8th edition of the AJCC TNM classification for esophageal cancer (16).
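The stratified random split described above can be sketched as follows. This is only an illustration: the stratification factors (`sex`, `stage`), the synthetic patient records, and the random seed are hypothetical, since the factors actually used are not listed here. Because each stratum is rounded separately, the cohort sizes may differ slightly from 155 and 66.

```python
import random
from collections import defaultdict

def stratified_split(patients, strata_keys, train_fraction, seed):
    """Randomly split records into training/validation sets within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        # Patients sharing the same combination of factors form one stratum.
        strata[tuple(patient[k] for k in strata_keys)].append(patient)
    train, valid = [], []
    for key in sorted(strata):
        members = strata[key][:]
        rng.shuffle(members)
        n_train = round(len(members) * train_fraction)
        train.extend(members[:n_train])
        valid.extend(members[n_train:])
    return train, valid

# Hypothetical cohort of 221 synthetic records (not real patient data).
patients = [
    {"id": i, "sex": "M" if i % 4 else "F", "stage": "III" if i % 3 else "II"}
    for i in range(221)
]
train, valid = stratified_split(patients, ["sex", "stage"], 155 / 221, seed=0)
print(len(train), len(valid))
```

Splitting within each stratum keeps the proportion of every factor combination roughly equal in the two cohorts, which is the purpose of the stratification.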
2.2. Chemoradiotherapy protocol
Radiotherapy was delivered daily using three-dimensional conformal radiation therapy (3DRT) or intensity-modulated radiation therapy (IMRT) on a Varian IX or Varian 23EX linear accelerator. The gross tumor volume (GTV) included the primary esophageal tumor (GTVp) and the positive regional lymph nodes (GTVnd). The GTV was delineated on CT images with reference to barium esophagogram, endoscopic examination, or PET imaging. The clinical target volume (CTV) was defined as the GTVp with a 0.5 to 1 cm radial expansion and a 2.5 to 3 cm axial expansion, or the GTVnd with a 0.5 to 0.8 cm uniform expansion. The planning target volume (PTV) was defined as the CTV with a 1 cm uniform expansion. A total prescribed dose of 50–72 Gy (median, 64 Gy) was delivered in conventional fractionation.
Two cycles of platinum-based chemotherapy were administered concurrently with radiotherapy. Sixty-one patients received the TP (paclitaxel + cisplatin) regimen every three weeks, which consisted of cisplatin (60 mg/m² on Day 1) plus paclitaxel (135–180 mg/m² on Day 1). One hundred and sixty patients received the PF (cisplatin + fluorouracil) regimen every four weeks, which consisted of cisplatin (60 mg/m² on Day 1) and fluorouracil (750 mg/m²/24 h on Days 1 to 4).
2.3. Response evaluation
The response to chemoradiotherapy was evaluated one month after treatment using CT images and barium esophagogram, according to the short-term response evaluation criteria for esophageal cancer. Clinical response was classified as complete response (CR), partial response (PR), no response (NR), or progressive disease (PD). Patients who were classified as CR by barium esophagogram and who had a maximal esophageal wall thickness of ≤ 1.2 cm and a residual lymph node volume of ≤ 1.0 cm³ on CT were finally defined as CR (17).
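The combined CR definition can be expressed as a simple rule. The sketch below uses hypothetical function and argument names and returns only whether the final CR definition is met; how patients who fail the CT criteria are re-classified is not specified here, so the sketch does not assign another category.

```python
MAX_WALL_THICKNESS_CM = 1.2         # maximal esophageal wall thickness on CT
MAX_RESIDUAL_NODE_VOLUME_CM3 = 1.0  # residual lymph node volume on CT

def is_complete_response(esophagogram_response, wall_thickness_cm, node_volume_cm3):
    """Return True only if the esophagogram and both CT criteria indicate CR."""
    return (
        esophagogram_response == "CR"
        and wall_thickness_cm <= MAX_WALL_THICKNESS_CM
        and node_volume_cm3 <= MAX_RESIDUAL_NODE_VOLUME_CM3
    )

# Hypothetical example values.
print(is_complete_response("CR", 1.0, 0.4))  # all three criteria met
print(is_complete_response("CR", 1.5, 0.4))  # wall thickness exceeds 1.2 cm
```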
2.4. Radiomic feature extraction
All patients were scanned before radiotherapy using a GE LightSpeed 64-slice spiral CT scanner (GE Medical Systems, Milwaukee, WI). CT images were acquired with the following protocol: tube voltage, 120 kV; tube current, 120 mAs; gantry rotation time, 0.6 s; detector collimation, 64 × 0.625 mm; field of view (FOV), 400–500 mm; matrix, 512 × 512; slice thickness, 5 mm; slice interval, 5 mm. Contrast medium (ioproxamine injection 300, 1–1.5 ml/kg) was injected with a high-pressure syringe at a flow rate of 3.0 ml/s, followed by a 30 to 40 ml normal saline flush, and late arterial phase CT images were acquired with a delay of 30 s. To reduce the variability between images from different patients, all images were resampled to a voxel size of 1 × 1 × 1 mm³.
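To make the effect of resampling concrete, the sketch below computes the grid size after resampling to 1 × 1 × 1 mm³ from the acquisition parameters above. The in-plane spacing is taken as FOV divided by matrix size; the slice count of 60 is a hypothetical value, and the interpolation method is not addressed because it is not stated here.

```python
def resampled_shape(shape, spacing_mm, new_spacing_mm=(1.0, 1.0, 1.0)):
    """Grid size that covers the same physical extent at the new voxel spacing."""
    return tuple(
        max(1, round(n * old / new))
        for n, old, new in zip(shape, spacing_mm, new_spacing_mm)
    )

fov_mm = 400.0                     # lower end of the 400-500 mm FOV range
matrix = 512
in_plane_mm = fov_mm / matrix      # 0.78125 mm per pixel
original_shape = (512, 512, 60)    # 60 slices is a hypothetical count
original_spacing = (in_plane_mm, in_plane_mm, 5.0)

new_shape = resampled_shape(original_shape, original_spacing)
print(new_shape)  # (400, 400, 300)
```

In this example the in-plane grid shrinks slightly while the number of slices increases fivefold, because the 5 mm slice interval is much coarser than the in-plane pixel size.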
3D Slicer (version 4.10.2, Stable Release) with the radiomics extension was used for image segmentation to obtain the volumes of interest (VOIs). The primary tumor volume (GTV) delineated by radiation oncologists for radiotherapy treatment planning was defined as the VOI for radiomic feature extraction. Any pixel with an attenuation of less than −50 HU was excluded to avoid adjacent air, fat, blood vessels, and surrounding organs. Image segmentation was performed independently by a radiation oncologist and a radiologist. To assess the reproducibility of radiomic feature extraction, tumor segmentation was repeated two months later by the same radiologist in 30 randomly chosen patients.
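The attenuation threshold amounts to refining the delineated mask. A minimal sketch, assuming the image and the mask are available as flat sequences of equal length (the names and example values are hypothetical):

```python
HU_THRESHOLD = -50  # pixels below this attenuation are excluded from the VOI

def apply_hu_threshold(hu_values, mask, threshold=HU_THRESHOLD):
    """Keep a pixel only if it lies inside the mask and is not below the threshold."""
    return [bool(inside) and hu >= threshold for hu, inside in zip(hu_values, mask)]

hu_values = [-120, -50, 35, 60]   # hypothetical attenuation values in HU
gtv_mask = [1, 1, 1, 0]           # 1 = inside the delineated GTV
print(apply_hu_threshold(hu_values, gtv_mask))  # [False, True, True, False]
```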
PyRadiomics V3.6.2 was used to extract radiomic features from the delineated VOIs. Several categories of features were extracted, including first-order statistics (intensity histogram, IH) features, shape-based features, and texture features (gray-level co-occurrence matrix, GLCM; gray-level size-zone matrix, GLSZM; gray-level run-length matrix, GLRLM; neighboring gray-tone difference matrix, NGTDM; and gray-level dependence matrix, GLDM). A wavelet filter was applied in image pre-processing for texture feature extraction. In all, 107 original features and 744 wavelet features were collected for each VOI (Supplemental Table 1). Among the 107 original features, there were 18 first-order statistics features, 14 shape-based features, 24 GLCM features, 14 GLDM features, 16 GLRLM features, 16 GLSZM features, and 5 NGTDM features. Mathematical definitions of these radiomic features have been described previously (18) and are available at https://pyradiomics.readthedocs.io/en/latest/features.html.
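The feature counts above can be checked with simple arithmetic. The sketch assumes that the 3D wavelet filter yields eight sub-band images (every combination of low- and high-pass filtering along the three axes) and that shape features are computed from the mask only, so they are not repeated for filtered images. Under these assumptions, the 744 wavelet features correspond to the 93 non-shape features computed on each of the eight sub-bands.

```python
ORIGINAL_FEATURES = {
    "first_order": 18,
    "shape": 14,
    "glcm": 24,
    "gldm": 14,
    "glrlm": 16,
    "glszm": 16,
    "ngtdm": 5,
}
# Assumed sub-bands of a 3D wavelet decomposition (L = low-pass, H = high-pass).
WAVELET_SUBBANDS = ["LLL", "LLH", "LHL", "LHH", "HLL", "HLH", "HHL", "HHH"]

n_original = sum(ORIGINAL_FEATURES.values())           # 107
n_non_shape = n_original - ORIGINAL_FEATURES["shape"]  # 93
n_wavelet = n_non_shape * len(WAVELET_SUBBANDS)        # 744
n_total = n_original + n_wavelet                       # 851

print(n_original, n_wavelet, n_total)
```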
2.5. Statistical analysis
First, the Chi-squared test or Fisher's exact test was used to assess differences in clinical characteristics between the training cohort and the validation cohort. A P-value of < 0.05 was considered statistically significant.
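As an illustration of the cohort comparison, the sketch below computes a Pearson Chi-squared test for a 2 × 2 table (one binary characteristic by cohort) without continuity correction. Whether a correction was applied in the analysis is not stated, and the counts are hypothetical. Fisher's exact test, typically preferred when expected counts are small, is not implemented here.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson Chi-squared statistic and P-value (1 degree of freedom) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, the upper tail probability equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: rows = training / validation cohort, columns = male / female.
statistic, p_value = chi_squared_2x2(110, 45, 47, 19)
print(round(statistic, 4), round(p_value, 4))
print("significant" if p_value < 0.05 else "not significant")
```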
In the pre-processing of radiomic features, all feature values were normalized using Z-score normalization, which brought the feature values into similar ranges and reduced the influence of features with large or widely dispersed values. Intra-class correlation coefficient (ICC) analysis was performed to evaluate the reproducibility of each radiomic feature. Only features with an ICC ≥ 0.900 were selected for further analysis. Then, least absolute shrinkage and selection operator (LASSO) Cox regression was performed using R software version 3.6.2 (R Foundation for Statistical Computing, Vienna, Austria) to identify the features associated with LPFS in the training cohort. The optimal tuning parameter lambda (λ) of the LASSO model was chosen by 10-fold cross-validation as the value giving the minimum partial likelihood deviance. A radiomics feature score (Rad score) was built for each patient based on the LASSO Cox regression model in the training cohort. The Rad score was calculated as:
$$\text{Rad score} = \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_n X_n$$
In the above formula, $X_1, X_2, \dots, X_n$ are the radiomic features identified by the LASSO Cox regression model, and $\beta_1, \beta_2, \dots, \beta_n$ are the regression coefficients of the corresponding features in the regression model.
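The Z-score normalization and the Rad score formula can be combined in a short sketch. The feature names, coefficients, and values below are hypothetical. The sketch also assumes that the mean and standard deviation are estimated in the training cohort and reused for other patients, which is not stated explicitly here.

```python
import statistics

def fit_z_score(train_values):
    """Mean and sample standard deviation of one feature in the training cohort."""
    return statistics.fmean(train_values), statistics.stdev(train_values)

def rad_score(features, coefficients, scalers):
    """Linear combination of Z-score-normalized features and their coefficients."""
    score = 0.0
    for name, beta in coefficients.items():
        mean, sd = scalers[name]
        score += beta * (features[name] - mean) / sd
    return score

# Hypothetical training values for two selected features.
training = {"feature_a": [1.0, 2.0, 3.0], "feature_b": [10.0, 20.0, 30.0]}
scalers = {name: fit_z_score(values) for name, values in training.items()}

# Hypothetical non-zero LASSO Cox coefficients.
coefficients = {"feature_a": 0.5, "feature_b": -0.25}

patient = {"feature_a": 3.0, "feature_b": 10.0}
score = rad_score(patient, coefficients, scalers)
print(score)  # 0.5 * 1.0 + (-0.25) * (-1.0) = 0.75
```

Only features with non-zero coefficients contribute, so the dictionary of coefficients lists just the features retained by the LASSO model.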
Univariate analysis was performed to identify potential prognostic factors associated with LPFS. Multivariable Cox regression analysis was then performed to identify independent predictors of LPFS. Based on the results of the multivariable Cox regression analysis, a nomogram combining the Rad score and clinical factors for predicting LPFS was developed and validated using the rms and foreign packages in R. The predictive performance of the nomogram was assessed using calibration curves in both the training cohort and the validation cohort. All analyses were performed with R software version 3.6.2.