In this study, the DTL model achieved robust performance in abnormal fundus image detection: in an independent subset of the test dataset, its AUC, sensitivity, accuracy, and specificity were 0.926, 88.17%, 87.18%, and 86.67%, respectively.
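For reference, the four metrics reported above can be computed as in the following minimal sketch. The function names, the confusion-matrix counts, and the scores in the usage lines are illustrative assumptions and are not taken from this study; the AUC is computed here with the rank-based (Mann-Whitney) formulation, which may differ from the tooling actually used.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: abnormal images flagged / missed; tn/fp: normal images passed / flagged.
    """
    sensitivity = tp / (tp + fn)                 # share of abnormal images detected
    specificity = tn / (tn + fp)                 # share of normal images passed
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # share of all images classified correctly
    return sensitivity, specificity, accuracy


def auc_from_scores(labels, scores):
    """AUC as the probability that a randomly chosen abnormal image (label 1)
    receives a higher score than a randomly chosen normal image (label 0).
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, for illustration only.
sens, spec, acc = screening_metrics(tp=8, fn=2, tn=9, fp=1)
auc = auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, it always lies between the two, as it does in the values reported above.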
AI-based automated detection of retinal diseases using deep learning and transfer learning systems has been reported in several studies. The initial focus was on deep learning technology. Ting et al. [14] validated their deep learning system (DLS) using 494,661 retinal images, demonstrating that the DLS had high sensitivity and specificity for identifying diabetic retinopathy (DR) and related eye diseases: for the detection of any DR, the AUC was 0.94–0.96; for possible glaucoma, the AUC was 0.942; and for AMD, the AUC was 0.931. Similarly, Li et al. [15] described the development and validation of an artificial intelligence-based deep learning algorithm for the detection of referable diabetic retinopathy, using 71,043 retinal images acquired from a web-based source. Testing against an independent multiethnic dataset achieved an AUC, sensitivity, and specificity of 0.955, 92.5%, and 98.5%, respectively. Stevenson et al. [16] reported the performance of their proof-of-concept AI system on 4,435 images. Their classifiers for AMD and vascular occlusion both achieved accuracies of 99.1%, sensitivities over 99%, and specificities of 88.9%. Compared with these studies, our independent testing performance was relatively low (AUC 0.926, sensitivity 88.17%, accuracy 87.18%, and specificity 86.67%). This may be because the outputs of our model were divided into a normal group and an abnormal group, the latter including a multitude of disease states; thus, some rare lesions and microlesions were not detected by the DTL. Previous studies suggest that AI can become a tool for quickly and reliably detecting and diagnosing eye diseases from medical images. AI-based DL can detect and identify fundus diseases with high sensitivity and accuracy. The application of AI in ophthalmology may increase accessibility and improve efficiency in large-scale eye disease screening programs.
Although some studies have shown outstanding results, several limitations should be considered. First, most of the studies required a large, manually labeled dataset for training and validation, which demands considerable time, manpower, and material resources. In addition, diagnoses can vary by region. Second, false-negative results deserve more thorough analysis to identify their features and relevance. By comparison, our study is, to our knowledge, the first to develop a DTL model that detects abnormal fundus images using a small dataset.
Deep transfer learning classification has been used for many years in disease screening research. Santin et al. [17] used transfer learning to characterize abnormal cartilage, taking a pretrained VGG16 neural network and adapting its final layers to a binary classification problem. The AUC, sensitivity, and specificity in their study were 0.72, 83%, and 64%, respectively. In an independent sample of 189 new thyroid images, the AUC was 0.70. Both their study and ours used a small dataset, but the Inception-ResNet-v2 architecture in our study performed better than VGG16 did in theirs. Similarly, Heisler et al. [18] demonstrated three different transfer learning methods for identifying cones in a small set of AO-OCT images using a base network trained on AO-SLO images, all of which obtained results similar to those of a manual rater. Using the results from the fine-tuning (Layer 5) method, they calculated four cone mosaic parameters that were similar to those found in AO-SLO images, showing the utility of their method. Christopher et al. [19] demonstrated that deep learning methodologies have high diagnostic accuracy for identifying fundus photographs with glaucomatous damage to the ONH in a racially and ethnically diverse population. The best-performing model was a transfer learning ResNet architecture, which achieved an AUC of 0.91 in identifying glaucomatous optic neuropathy (GON) from fundus photographs, outperforming previously published accuracies of automated systems for identifying GON in fundus images. These studies showed that, by employing transfer learning, models can learn faster and with less data.
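The idea shared by these approaches, reusing a pretrained base and training only a new binary output layer, can be illustrated with the following toy sketch. It is a feature-extraction form of transfer learning under strong simplifying assumptions: `frozen_base` is a hypothetical fixed mapping that stands in for a pretrained network (it is not VGG16, ResNet, or Inception-ResNet-v2), the two-dimensional inputs and labels are made up, and the head is a plain logistic layer trained by stochastic gradient descent.

```python
import math

# Hypothetical fixed weights standing in for a pretrained base network.
# They are never updated during training (the base is "frozen").
BASE_WEIGHTS = [[1.0, 0.5], [-0.5, 1.0]]


def frozen_base(x):
    """Map a raw input to features using the frozen base."""
    feats = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in BASE_WEIGHTS]
    return feats + [1.0]  # constant feature acts as the bias input of the head


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(inputs, labels, epochs=1000, lr=0.5):
    """Train only the new binary head (logistic layer) on top of the frozen base."""
    head = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, y in zip(inputs, labels):
            f = frozen_base(x)
            p = sigmoid(sum(w * v for w, v in zip(head, f)))
            # Gradient step on the log loss; only the head weights change.
            head = [w - lr * (p - y) * v for w, v in zip(head, f)]
    return head


def predict(head, x):
    """Return 1 (abnormal) or 0 (normal) for a single input."""
    f = frozen_base(x)
    return 1 if sigmoid(sum(w * v for w, v in zip(head, f))) >= 0.5 else 0


# Made-up toy data: label 1 plays the role of "abnormal", 0 of "normal".
inputs = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (1.0, 1.0), (0.9, 0.8), (0.8, 1.0)]
labels = [0, 0, 0, 1, 1, 1]

head = train_head(inputs, labels)
predictions = [predict(head, x) for x in inputs]
```

Because only the three head weights are fitted, a handful of labeled examples suffices in this toy setting, which mirrors why transfer learning is attractive when labeled medical images are scarce. Fine-tuning, as in the Layer 5 method mentioned above, would additionally update some of the base weights.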
In this study, the reasons for false-negative cases in the testing dataset were analyzed. Highly myopic fundus images accounted for more than half of all false-negative cases. This result could be attributed to our experts labeling mild myopic fundus as normal; as a consequence, the model confused mild myopic fundus images with pathologic myopic images. Likewise, the false-positive cases included mild myopic fundus images. Other causes of false negatives included peripheral retinal microlesions, vascular microlesions, optic neuritis, and congenital optic neuropathy.
This study presented an automated screening model that was trained with a relatively small number of fundus images. It can attain clinically acceptable performance in abnormal fundus image detection and will benefit medical institutions that have no retinopathy screening program or lack experienced ophthalmologists. Additionally, the proposed model showed high accuracy and reproducibility in detecting abnormal fundus images, even though it was trained with a limited dataset. The DTL will permit users to utilize relation-labeled graph data to construct a detection model for the target image data. In this study, the transfer learning algorithm showed promising prospects for retinal disease screening in community health care centers. The techniques described here also have great potential for image classification in other medical fields.
DTL is surprisingly effective in image classification. However, our study in its current state has several limitations. First, because our experts labeled mild myopic fundus as normal in the training set, the DTL trained on this set learned a higher-than-normal prior probability for eye disease detection, which may cause a high false-negative rate. Second, our study dataset is not large and includes only patients from a local clinical setting. At present, the algorithm cannot operate independently or match professional evaluation, but it can flag abnormal fundus images with obvious diagnoses so that ophthalmologists can focus on more difficult cases.