Study subjects
We retrospectively reviewed the data of PTC patients treated at two institutes between January 2018 and January 2020. The inclusion criteria were: (i) adult patients over 18 years old who had not previously received surgery or antitumor treatment; (ii) PTC confirmed by surgical pathology, with LN pathology obtained by cervical LN dissection; (iii) preoperative noncontrast and contrast-enhanced CT scans performed within 2 weeks prior to surgery and biopsy. The exclusion criteria were: (i) incomplete clinical or imaging data; (ii) unclear LN display due to poor image quality or other problems.
Based on our previous study and other reports [25–28], the following inclusion criteria were established for metastatic LNs: (i) no other definite cervical LN lesion, such as tuberculosis or lymphoma; (ii) selected from a cervical LN group with two LN metastases confirmed by pathology; (iii) short diameter of the LN ≥ 5 mm; (iv) the LN with the highest score, and a score of ≥ two points (each of the classic signs of PTC LN metastasis, namely maximum short diameter, short diameter/long diameter ≥ 1/2, highest enhancement, cystic degeneration/necrosis, and microcalcification, was assigned one point). Non-metastatic LNs had to meet the following three criteria: (i) no other definite cervical LN lesion, such as tuberculosis or lymphoma; (ii) selected from cervical LN groups with no LN metastasis confirmed by pathology; (iii) the LN with the largest diameter within the groups described in (ii), with a short diameter ≥ 5 mm.
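The scoring rule in criterion (iv) can be illustrated with a minimal Python sketch. The `LymphNode` record, its field names, and the example measurements are hypothetical and are not taken from the study; the two group-relative signs (maximum short diameter and highest enhancement) are assumed to have been judged beforehand and are passed in as flags.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LymphNode:
    """Hypothetical record of one LN within a pathologically positive group."""
    name: str
    short_diameter_mm: float
    long_diameter_mm: float
    has_max_short_diameter: bool   # largest short diameter in its group
    has_highest_enhancement: bool  # strongest enhancement in its group
    cystic_or_necrotic: bool
    microcalcification: bool


def sign_score(ln: LymphNode) -> int:
    """One point for each of the five classic signs listed in criterion (iv)."""
    signs = [
        ln.has_max_short_diameter,
        ln.short_diameter_mm / ln.long_diameter_mm >= 0.5,
        ln.has_highest_enhancement,
        ln.cystic_or_necrotic,
        ln.microcalcification,
    ]
    return sum(signs)


def select_metastatic_candidate(group: List[LymphNode]) -> Optional[LymphNode]:
    """Return the highest-scoring LN with short diameter >= 5 mm and score >= 2."""
    eligible = [ln for ln in group
                if ln.short_diameter_mm >= 5.0 and sign_score(ln) >= 2]
    if not eligible:
        return None
    # Ties fall to the first LN in the list; the text does not say how ties were handled.
    return max(eligible, key=sign_score)


# Made-up example group
group = [
    LymphNode("A", 6.0, 9.0, True, True, False, False),
    LymphNode("B", 4.0, 6.0, False, False, True, True),    # short diameter < 5 mm
    LymphNode("C", 5.5, 14.0, False, False, False, True),  # only one sign present
]
chosen = select_metastatic_candidate(group)
print(chosen.name, sign_score(chosen))
```

In this made-up group only node A meets both the 5 mm threshold and the two-point minimum, so it is the one selected.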
CT Examinations
Each patient underwent noncontrast and contrast-enhanced CT examinations in the supine position on a 16-slice CT scanner (Institute 1: Lightspeed, GE, United States; Institute 2: Siemens Healthineers, Germany). The scanning range extended from the oropharynx to the superior margin of the clavicle. The contrast agent (Institute 1: Bayer, Germany; Institute 2: Yangtze River, China) was injected intravenously at the elbow with a high-pressure injector. The specific parameters of the two institutes are listed in Table 1.
Table 1
CT scan parameters of the two institutes
Parameters | Institute 1 | Institute 2 |
Tube voltage (kV) | 120 | 120 |
Tube current (mA) | 250 | 250 |
Collimation (mm) | 0.625 × 16 | 0.625 × 16 |
Pitch | 0.938 | 1.1 |
Gantry rotation time (s) | 0.5 | 0.5 |
Slice thickness (mm) | 3.75 | 3.0 |
Contrast agent dose (mL) | 50–60 | 70–80 |
Iodine concentration (mgI/mL) | 370 | 300 |
Injection rate (mL/s) | 1.5–2.0 | 2.5–3.0 |
Contrast-enhanced CT time (s) | 50 | 50 |
Note: Contrast-enhanced CT time is the delay between contrast injection and the start of the contrast-enhanced CT scan. |
LN Segmentation
The noncontrast and contrast-enhanced CT images of the patients were retrieved from the picture archiving and communication system and imported into 3DQI, an open volumetric image analysis platform developed by the 3D quantitative imaging laboratory at Massachusetts General Hospital and Harvard Medical School. The platform supports data loading, segmentation, feature calculation, feature selection, and building of the radiomic model. One radiologist (P.W., with 5 years of experience) manually contoured the target LN on the contrast-enhanced CT images, slice by slice, and used these contours as the reference to delineate the LN at the same level on the noncontrast CT scan, so that the two segmentations remained consistent. Care was taken to avoid the surrounding blood vessels, calcification, peripheral fat, and other non-LN tissues. When the LN was ambiguous in the axial plane, the corresponding sagittal and coronal planes were consulted. The segmentation results were checked by a senior head and neck radiologist (Z.H., with 18 years of experience in head and neck radiology). Any disagreement between the two radiologists was resolved by discussion and consensus. Both radiologists were blinded to the postoperative pathological assessment of the LNs.
Radiomic Features Extraction And Analysis
Radiomic feature extraction and analysis were carried out on the 3DQI platform. Five categories of volumetric features were calculated for each segmented LN: 11 shape features, 25 histogram statistical textures, 22 gray-level co-occurrence matrix (GLCM) textures, 16 gray-level run-length matrix (GLRLM) textures, and 14 gray-level zone size matrix (GLZSM) textures. A single-level 3D discrete wavelet transform was used to decompose each volumetric image into eight sub-volumes, labeled LLL, LLH, LHL, LHH, HLL, HLH, HHL, and HHH, where L and H denote low- and high-frequency signals, respectively. In each of the eight decomposed volumes, 3DQI calculated the four texture categories other than the shape features (77 textures) within the segmented LN. This resulted in 704 texture features per segmented LN in one scan: 88 volumetric textures plus 616 (8 × 77) wavelet-transformed textures. Because the CT images of each LN comprised two phases (pre-contrast and post-contrast) and the texture features were calculated for each phase separately, a total of 1408 texture features were obtained per LN.
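The feature counts above can be checked with a short Python sketch. The per-category counts and sub-band labels come from the text; the feature-name scheme (`pre_LLH_GLCM_3` and so on) is purely illustrative and does not reflect how 3DQI names its outputs.

```python
from itertools import product

# Per-category feature counts as stated in the text.
BASE_COUNTS = {"shape": 11, "histogram": 25, "GLCM": 22, "GLRLM": 16, "GLZSM": 14}

# Eight sub-band labels of a single-level 3D wavelet decomposition (L or H per axis).
SUBBANDS = ["".join(p) for p in product("LH", repeat=3)]

base_total = sum(BASE_COUNTS.values())           # features on the original volume
non_shape = base_total - BASE_COUNTS["shape"]    # textures recomputed in each sub-band
wavelet_total = len(SUBBANDS) * non_shape        # wavelet-transformed textures
per_scan = base_total + wavelet_total            # one phase
per_node = 2 * per_scan                          # pre-contrast + post-contrast


def feature_names(phases=("pre", "post")):
    """Enumerate illustrative (hypothetical) feature names for one LN."""
    names = []
    for phase in phases:
        for cat, n in BASE_COUNTS.items():
            names += [f"{phase}_orig_{cat}_{i}" for i in range(n)]
        for band in SUBBANDS:
            for cat, n in BASE_COUNTS.items():
                if cat == "shape":
                    continue  # shape is not recomputed on wavelet sub-bands
                names += [f"{phase}_{band}_{cat}_{i}" for i in range(n)]
    return names


print(base_total, non_shape, wavelet_total, per_scan, per_node)
print(len(feature_names()))
```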
For modeling, features useful for classification and image recognition were selected from the large pool of texture features. We adopted a two-round feature selection method to select the features important for the classification of benign and malignant LNs. In the first round, the importance scores calculated by the Boruta algorithm were used for a rapid reduction of texture dimensionality [29]. The Boruta algorithm is a feature ranking and selection algorithm based on the random forest algorithm; it identifies all features that are either strongly or weakly related to the decision variable. Non-relevant features were rejected using a Z-score cutoff of less than 0.01. In the second round, an iterative culling-out algorithm was used to refine the performance of the classifier, for which an RF model was used [30]. In each iteration, we removed one texture at a time and calculated the classification performance of the resulting model, measured by the area under the receiver operating characteristic (ROC) curve (AUC). If a model with one less texture had a higher AUC than the current model, the reduced model with the maximum AUC was selected as the new current model. The iteration continued until no removal improved on the AUC of the current model.
We compared four machine-learning classification algorithms, namely random forest (RF), support vector machine (SVM), neural network (NN), and naïve Bayes (NB), and determined the optimal algorithm for the radiomic model. The performance of each model was measured by AUC. We used 10-fold cross-validation to validate the models during training. For the 10-fold cross-validation, the entire dataset was randomly divided into 10 subsets. Nine subsets were used to train the classification model, and the trained model was then tested on the subset that had not been used for training. This procedure was repeated until every subset had been used for testing. The 10-fold cross-validation was repeated 10 times to optimize and stabilize the performance of the models. In addition, the built models were tested with the external dataset from Institute 2. We compared the performance of the models on the classification of benign and malignant LNs using AUC, sensitivity, specificity, and accuracy.
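The second-round culling loop can be sketched in Python as a greedy backward elimination. This is a minimal illustration under stated assumptions, not the 3DQI implementation: the `score` callable stands in for the AUC of an RF model trained on a given feature subset, and the toy scoring function and feature names below are invented so that the example runs without any data.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def cull_features(features: Iterable[str],
                  score: Callable[[FrozenSet[str]], float]
                  ) -> Tuple[FrozenSet[str], float]:
    """Greedy backward elimination: drop one feature per iteration while AUC improves."""
    current = frozenset(features)
    current_score = score(current)
    while len(current) > 1:
        # Score every candidate model obtained by removing exactly one feature.
        candidates = [(score(current - {f}), f) for f in sorted(current)]
        best_score, dropped = max(candidates)  # ties fall to the later name alphabetically
        if best_score <= current_score:
            break  # no single removal beats the current model
        current = current - {dropped}
        current_score = best_score
    return current, current_score


# Hypothetical stand-in for "AUC of an RF model on this subset":
# two informative features raise the score, every other feature lowers it.
USEFUL = {"glcm_entropy": 0.15, "hist_mean": 0.10}


def toy_auc(subset: FrozenSet[str]) -> float:
    auc = 0.5
    for name in sorted(subset):  # sorted for a deterministic float sum
        auc += USEFUL.get(name, -0.02)
    return auc


selected, auc = cull_features(
    ["glcm_entropy", "hist_mean", "noise_1", "noise_2", "noise_3"], toy_auc)
print(sorted(selected), round(auc, 2))
```

With the toy scorer, the three noise features are removed one per iteration and the loop stops once removing either informative feature would lower the score.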
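The repeated 10-fold splitting and the threshold-based metrics can be illustrated with a standard-library Python sketch. The sample size of 95, the random seed, and the example labels are hypothetical; the split is a plain random partition as described above, and the sketch does not reproduce the 3DQI implementation or compute AUC.

```python
import random
from typing import Dict, Iterator, List, Sequence, Tuple


def repeated_kfold(n_samples: int, k: int = 10, repeats: int = 10,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for `repeats` independent random k-fold splits."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        folds = [indices[i::k] for i in range(k)]  # k near-equal random subsets
        for i, test_idx in enumerate(folds):
            train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            yield train_idx, test_idx


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy; assumes both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


splits = list(repeated_kfold(n_samples=95))  # 95 is an arbitrary example size
print(len(splits))                           # number of train/test rounds

# Made-up labels: 1 = metastatic, 0 = non-metastatic
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 0, 1, 0, 0, 1, 0, 1])
print(metrics)
```

Within each repeat every sample appears in exactly one test fold, so averaging the metrics over all rounds gives each sample equal weight across the 10 repetitions.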
Statistical analysis
The statistical methods used in feature selection and model construction were provided by the 3DQI software. SPSS 22.0 was used to analyze general data. The Kolmogorov-Smirnov test was used to assess the normality of age. Normally distributed data were expressed as mean ± standard deviation (SD) and compared with the t-test; non-normally distributed data were expressed as median (interquartile range) and compared with the Mann-Whitney test. Pearson’s chi-square test or Fisher’s exact test was used to compare gender between the two groups. P < 0.05 was considered statistically significant.