Assessment of cytologically indeterminate nodules is still a critical challenge for thyroidologists who need to balance the risk of cancer misdiagnosis and that of overtreatment with associated (potential) side effects and costs.
Molecular imaging using 18FDG provides useful data on biological behavior and aggressiveness of thyroid nodules [12]. In particular, available data consistently show high diagnostic performance of visually interpreted 18FDG PET/CT in cytologically indeterminate nodules, with pooled sensitivity and NPV values of 95% and 96%, respectively [13]. Indeed, De Geus-Oei et al. and Sebastianes et al. argued that preoperative 18FDG PET/CT could reduce the number of futile hemithyroidectomies by 66% and 39%, respectively [14, 15].
More recently, the 5-year cost-effectiveness of routine 18FDG PET/CT in patients with cytologically indeterminate thyroid nodules was also assessed. Routine 18FDG PET/CT could prevent 47% of inappropriate diagnostic lobectomies, reducing costs and increasing patients’ quality of life. Moreover, 18FDG PET/CT compared favorably with a gene expression classifier test and molecular marker panel [16]. All considered, a visually 18FDG-negative nodule carries a very low risk of malignancy but the accuracy of 18FDG PET/CT is limited in 18FDG-avid nodules [17].
A different approach to 18FDG PET/CT evaluation has been recently proposed based on the assessment of 18FDG distribution within thyroid nodules. Preliminary data suggested that metabolic heterogeneity might differentiate benign from malignant nodules more accurately than conventional PET metrics [18]. Sollini et al. reported interesting results by evaluating histogram-based and matrix-based features with textural analysis [19, 20]. In a previous study [5], Ceriani et al. demonstrated good accuracy of a model incorporating PET metrics and RF in distinguishing benign from malignant TIs. Briefly, shape_Sphericity was the best predictor, classifying 82% of TIs correctly. Moreover, TLG, SUVmax, and shape_Sphericity retained statistical significance in a multivariate analysis, and the malignancy rate increased from 7% to 100% in accordance with the number of positive parameters present. To the best of our knowledge, our study is the first attempt to integrate textural features and PET metrics in a multiparametric predictive model for cytologically indeterminate thyroid nodules. In such a clinically challenging population, we identified a radiomic signature including shape_Sphericity and glcm_Autocorrelation as an effective tool for discriminating benign from malignant lesions. In particular, we demonstrated that a predictive model combining RS and Bethesda classes accurately stratified the risk of malignancy of cytologically indeterminate and 18FDG-avid thyroid nodules.
Notably, the prevalence of malignancy may differ significantly between class III and IV nodules, depending on the prevalence of malignancies in the local population and the expertise of the cytopathologist; in our series, the prevalence of malignancy was consistently higher in class IV than in class III. Nevertheless, as the main result of our study, radiomic signatures remained independent predictors in multivariate analysis, and the addition of PET/CT radiomics in a multiparameter model refined the diagnostic accuracy in both categories.
Separate analyses were performed with and without Hürthle cell lesions as high 18FDG uptake is observed in both benign and malignant Hürthle cell nodules, and higher SUVmax values are generally recorded in Hürthle cell adenoma than in other benign nodules [21]. Interestingly, our model retained a good discriminating performance even when Hürthle cell lesions were included (PPV 79% and 58% without and with Hürthle cell lesions, respectively). Notably, a score of 0 excluded malignancy with 96% NPV even in Hürthle cell lesions, which represents a significant improvement compared with visual assessment of PET/CT with or without SUVmax incorporation. Compared with a previous patient population in whom 18FDG-avid TIs were incidentally detected during PET/CT [5], some differences were found in the current analysis. First, standard PET metrics features describing glucose metabolic rate were not able to discriminate benign from malignant lesions. Second, the lesion shape (i.e. sphericity) remained relevant when combined with a texture feature describing the intra-lesion heterogeneity. Such differences are likely due to subtle metabolic differences in benign and malignant follicular-patterned lesions, making conventional parameters, such as SUVmax, less relevant. Malignant lesions are histologically characterised by capsule and vascular invasion and apoptosis/necrosis potentially resulting in shape distortion and inhomogeneous tissue structure, making the role of textural features preeminent in differentiating benign from malignant follicular-patterned lesions.
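As a reminder of how the predictive values quoted above are derived, the following minimal Python sketch computes PPV and NPV from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not the study data.

```python
def predictive_values(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (PPV, NPV) from confusion-matrix counts.

    PPV = TP / (TP + FP): fraction of test-positive nodules that are malignant.
    NPV = TN / (TN + FN): fraction of test-negative nodules that are benign.
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one positive and one negative test result")
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical counts for illustration only (NOT taken from the study).
ppv, npv = predictive_values(tp=8, fp=2, tn=18, fn=2)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

Because both quantities are computed within the tested population, they shift when the share of malignant nodules in that population changes, as happens here when Hürthle cell lesions are added.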
Our retrospective, multicentre study has some limitations. First, a validated threshold value to segment 18FDG-avid thyroid nodules has not yet been defined. However, our approach was based on arbitrary selection of the SUVmean of the contralateral lobe to define the actual nodules’ volume independently of their metabolic activity and thus increase accuracy and reproducibility of radiomics analysis.
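The thresholding idea described above can be illustrated with a minimal Python sketch on a toy 2D slice. It assumes that voxels inside a manually drawn search region are assigned to the nodule when their SUV exceeds the SUVmean of the contralateral lobe; how the threshold was actually applied by the segmentation software is not detailed here, and all names and values below are hypothetical.

```python
from statistics import fmean


def segment_nodule(suv, search_region, contralateral_region):
    """Threshold-based segmentation sketch (assumed rule, see lead-in).

    suv                  -- 2D list of SUV values, indexed as suv[row][col]
    search_region        -- set of (row, col) voxels around the nodule
    contralateral_region -- set of (row, col) voxels in the contralateral lobe
    Returns the set of nodule voxels and the threshold that was used.
    """
    # Reference uptake: SUVmean of the contralateral lobe.
    threshold = fmean(suv[r][c] for r, c in contralateral_region)
    # Keep only search-region voxels whose SUV exceeds the reference.
    nodule = {(r, c) for r, c in search_region if suv[r][c] > threshold}
    return nodule, threshold


# Toy slice with made-up SUV values (illustration only).
suv_slice = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 4.0, 5.0, 1.0],
    [1.0, 3.0, 1.5, 1.0],
    [2.0, 2.0, 1.0, 1.0],
]
search = {(1, 1), (1, 2), (2, 1), (2, 2)}
contralateral = {(3, 0), (3, 1)}

nodule_voxels, used_threshold = segment_nodule(suv_slice, search, contralateral)
print(sorted(nodule_voxels), used_threshold)
```

Because the cut-off is taken from the patient's own background uptake rather than from the nodule's SUVmax, the delineated volume does not scale with how avid the nodule is, which is the property the approach aims for.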
Second, we analysed only lesions larger than 10 mm; thus, our findings may not be applicable to smaller nodules. On the other hand, current recommendation is to not perform fine-needle aspiration cytology in nodules, including 18FDG-avid nodules, less than 10 mm in largest diameter [22].
Third, while up to 25% of all thyroid nodule FNACs yield indeterminate results, a relatively small series of patients was enrolled in our study. It should be noted, however, that strict criteria for FNAC are applied in our centers, reducing the number of examinations; moreover, we only included 18FDG-avid nodules, and postoperative histological diagnosis was mandatory for inclusion.
Fourth, some studies suggest that the AUS/FLUS category should be further subdivided into AUS with cytologic atypia (higher risk for malignancy) and FLUS with architectural atypia (lower risk for malignancy) [23]. However, this approach has not yet been widely adopted in clinical practice and we cannot evaluate the potential impact of subdividing AUS and FLUS classes in our patients.
Fifth, we did not compare PET/CT data with ultrasound (US) and molecular biomarkers.
Ultrasound is one of the principal steps in the initial workup of thyroid nodules, and different risk stratification systems now recommend FNAC depending on nodule size and various combinations of US characteristics with an incremental risk of malignancy [22]. However, US remains an operator-dependent procedure and a reliable comparison of US results was precluded in our study as US examinations were performed by different sonographers in different centres. Additionally, although some authors support the use of ultrasound to risk stratify nodules with indeterminate cytology, no clear recommendations are provided regarding (re)interpretation of US characteristics after FNAC has resulted in indeterminate cytology [24, 25]. Notably, currently available US risk stratification systems have been evaluated against papillary carcinoma, while follicular-type malignancies typically have a different US appearance, and caution has been advised when using US to evaluate follicular-patterned lesions and capture FTCs [26].
Three different approaches characterize thyroid molecular tests [27]. One aims to exclude (rule out) and another aims to confirm (rule in) malignancy in the indeterminate category. The rule-out test is offered by Veracyte; it consists of the evaluation of several mRNAs to optimize NPV and is called the Afirma gene expression classifier (GEC). The rule-in test was developed by the University of Pittsburgh, is commercialized by CBLPath, and is known as ThyroSeq. It is based on next-generation sequencing (NGS) for point mutations and gene fusions in known thyroid cancer-related genes. Another available rule-in test is ThyGenX-ThyraMIR, commercialized by Interpace Diagnostics. Such tests are currently not available in Europe, even in referral centers, due to their high costs.
A different approach, available in many referral centers also in Europe, is based on the analysis of a gene mutation panel (BRAF, H-, N-, K-RAS, RET/PTC, PAX8/PPARγ). However, BRAF mutation is 100% predictive of papillary thyroid carcinoma (high PPV), but most cancers are BRAF-negative (very low NPV); mutations of RAS-family genes are also observed in follicular adenomas; and the prevalence of PAX8/PPARγ rearrangements is generally limited in IC nodules, with no cases reported in some studies.
Finally, all methods largely depend on local cancer prevalence and pretest probabilities, and no definitive guidelines exist; consequently, a locally adapted multimodality stepwise approach, ideally combining one rule-in and one rule-out test, likely offers the most accurate diagnosis [2]. Accordingly, the potential improvement generated by the integration of PET/CT radiomics, molecular and/or other imaging biomarkers certainly deserves to be further explored, and validation of the current model in a prospective study including a larger number of cases is warranted to confirm the present results.
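The dependence on pretest probability noted above follows directly from Bayes' theorem. The minimal Python sketch below shows how the PPV and NPV of a test with fixed sensitivity and specificity change with the local prevalence of malignancy. The sensitivity, specificity, and prevalence values are hypothetical and are not taken from any of the cited tests.

```python
def predictive_values_at_prevalence(sensitivity: float,
                                    specificity: float,
                                    prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) of a test at a given pretest probability of malignancy."""
    # Expected fractions of the tested population in each outcome cell.
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical test characteristics and two hypothetical local prevalences.
SENSITIVITY, SPECIFICITY = 0.90, 0.50
ppv_low, npv_low = predictive_values_at_prevalence(SENSITIVITY, SPECIFICITY, 0.15)
ppv_high, npv_high = predictive_values_at_prevalence(SENSITIVITY, SPECIFICITY, 0.40)

print(f"prevalence 15%: PPV = {ppv_low:.2f}, NPV = {npv_low:.2f}")
print(f"prevalence 40%: PPV = {ppv_high:.2f}, NPV = {npv_high:.2f}")
```

With the test characteristics held constant, a higher prevalence raises the PPV and lowers the NPV, which is why a rule-out test validated in a low-prevalence setting may not exclude malignancy as reliably elsewhere.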
We demonstrated that the combination of a PET/CT-based radiomic signature and Bethesda classes (i.e. III vs IV) in a predictive model increases the accuracy of risk stratification compared with the Bethesda system and PET/CT alone. This combined approach may reduce the number of diagnostic lobectomies, with associated advantages in terms of costs and quality of life for patients.