A daily dilemma in dentistry and oral surgery is whether a third molar should be removed. In cases of a diseased third molar, where pain or pathology is obvious, there is general consensus that surgical removal is indicated[3]. Improved diagnostics, e.g. on orthopantomograms (OPGs), might improve the selection of teeth for removal. A more stringent indication pathway could avoid millions of unnecessary third molar removals every year, thereby reducing morbidity and health-care costs[13].
This pilot study assesses the capability of a deep learning model (MobileNet V2) to detect carious third molars on OPGs and is therefore one piece in the larger effort to automate M3 removal diagnostics. Caries detection on third molars using OPGs is hampered by the limited and varying accuracy of individual examiners, leading to inconsistent decisions and consequently suboptimal care[8]. Deep neural networks may offer a more reliable, faster and more reproducible way of diagnosing pathology and could therefore reduce the number of unnecessary third molar removals[14]. As previously stated, the assessment of third molars on OPGs is additionally interesting because radiation exposure can be reduced when additional radiographs are avoided.
In dental radiology, previous studies have applied deep learning models for caries detection on different image modalities[7–12, 15]. Lee et al. applied a pre-trained GoogLeNet Inception v3 CNN to periapical radiographs, achieving accuracies up to 0.89[7]. Casalegno et al. applied U-Net with VGG-16 as an encoder to near-infrared transillumination (NITS) images[9], with a reported AUC between 0.836 and 0.856. ResNet-18 and ResNeXt-50 were applied by Schwendicke et al. to NITS images[10]; the reported AUC ranged from 0.730 to 0.856 in these studies. Two other studies explored caries detection on clinical photographs using Mask R-CNN with ResNet, reporting an accuracy of 0.870[12] and an F1-score of 0.889[11]. Finally, U-Net with EfficientNet-B5 as an encoder was used to segment caries on bitewings with an accuracy of 0.8[8]. It is important to note that the performance of deep learning models is highly dependent on the dataset, the hyperparameters, the image modality and the architecture itself[6, 15]. As these parameters differed between the studies, a direct comparison would be misleading.
In this study, an accuracy of 0.87 and an AUC of 0.90 were achieved for caries detection on third molars on OPGs. In comparison, previous studies have reported an AUC of 0.768 for caries detection on OPGs by clinicians[16]. Several factors contributed to the model performance. Firstly, the use of depthwise separable convolutions and the inverted residual with linear bottleneck reduced the number of parameters and the memory footprint while retaining high accuracy[17]. These characteristics make MobileNet V2 less prone to overfitting, a modelling error in which a model fits the training data well while generalizing unreliably to unseen data. Secondly, histogram equalization was applied to the OPGs as a pre-processing step. Histogram equalization adjusts image intensities to enhance contrast, which can increase prediction accuracy[18]. Lastly, transfer learning was used to prevent overfitting. Transfer learning pre-trains very deep networks on large datasets so that the early layers learn generic, low-level features. Reusing these learned weights on other tasks eliminates the need to re-learn these low-level features on new datasets, which greatly reduces the amount of data and time required for such a deep neural network to converge[19].
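Two of the ingredients above can be illustrated in isolation. The following minimal sketch, assuming NumPy and 8-bit grayscale input (all values are illustrative, not taken from the study), first counts the parameters of a standard versus a depthwise separable 3×3 convolution, then implements plain histogram equalization via the cumulative intensity distribution:

```python
import numpy as np

# --- Parameter count: standard vs. depthwise separable convolution ---
def conv_params(k, c_in, c_out):
    # standard convolution: one k x k x c_in kernel per output channel
    return k * k * c_in * c_out

def separable_params(k, c_in, c_out):
    # depthwise step (one k x k kernel per input channel)
    # followed by a 1 x 1 pointwise convolution
    return k * k * c_in + c_in * c_out

std = conv_params(3, 32, 64)       # 18432 parameters
sep = separable_params(3, 32, 64)  # 2336 parameters, roughly 8x fewer

# --- Histogram equalization for an 8-bit grayscale image ---
def equalize_hist(img):
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    # lookup table that linearizes the cumulative intensity distribution
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[img]

# a low-contrast image (values 100-139) is stretched to the full 0-255 range
low_contrast = np.tile(np.arange(100, 140, dtype=np.uint8), (8, 1))
equalized = equalize_hist(low_contrast)
```

In practice a library routine such as OpenCV's `cv2.equalizeHist` would typically be used; the hand-rolled version above only serves to make the intensity remapping explicit.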
A limitation of the present study is that only cropped images of third molars were included. Training and testing the model with cropped premolars, incisors and canines might further increase its robustness and its generalizability to all carious lesions on OPGs. Secondly, the clinical and radiological assessment by surgeons is not the gold standard for caries detection. Histological confirmation of caries and further expansion of the labeled dataset are required to overcome the limitations of the present model.
To the best of our knowledge, this is the first publication to apply deep learning to caries detection on third molars using solely OPGs. Furthermore, class activation maps were generated to increase the interpretability of the model predictions. Considering the encouraging results, future work should focus on the detection of other pathologies associated with third molars, such as pericoronitis, periapical lesions, root resorption and cysts. The potential bias of these algorithms, with the associated risks of limited robustness, generalizability and reproducibility, also has to be assessed in future studies using external datasets; this is a necessary step towards the successful implementation of deep learning in daily clinical practice.
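A class activation map of the kind mentioned above can be sketched as a channel-weighted sum of the last convolutional feature maps. The NumPy example below is a minimal illustration in which randomly generated activations and weights stand in for the real network outputs; the shapes (8 channels, 7×7 maps) are arbitrary assumptions, not the study's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical activations of the last convolutional layer: (channels, H, W)
feature_maps = rng.random((8, 7, 7))
# hypothetical weights of the dense layer connecting the
# global-average-pooled channels to the "caries" output unit
w = rng.random(8)

# class activation map: weighted sum of the feature maps over channels
cam = np.tensordot(w, feature_maps, axes=1)        # shape (7, 7)
cam = np.maximum(cam, 0)                           # keep positive evidence only
cam = (cam - cam.min()) / (cam.max() - cam.min())  # normalize to [0, 1]
# upsampled to the input resolution, this map can be overlaid on the
# cropped OPG to show which regions drove the caries prediction
```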
In conclusion, a convolutional neural network (CNN) was developed that achieved an F1 score of 0.87 for caries detection on third molars on panoramic radiographs. This forms a promising foundation for the further development of automated third molar removal assessment.