In this section, the selected papers are discussed in turn.
Sumaiya Pathan et al., [3] proposed a multi-step system for automatic glaucoma detection. The process involves preprocessing the images and then isolating the Region Of Interest (ROI) through the analysis of statistical features. Clinical and texture-based features are then extracted from the ROI. Finally, an ensemble of classifier models is constructed using dynamic selection techniques. Evaluations were carried out on both public databases and 300 hospital images. The most promising results came from an ensemble of Random Forest (RF) models with the META-DES dynamic ensemble selection technique. On the hospital database, this method achieved 100% accuracy, specificity, and sensitivity. On RIM-ONE, the average accuracy, specificity, and sensitivity reached 97.86%, 100%, and 93.85%, and on DRISHTI-GS they reached 97%, 90%, and 100%, respectively. These outcomes demonstrate the effectiveness of the system, particularly the RF ensemble with META-DES.
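For readers unfamiliar with dynamic ensemble selection, the following minimal sketch shows how a META-DES ensemble over a Random Forest pool can be assembled with the third-party DESlib library; synthetic features stand in for the clinical and texture descriptors used in the paper, and this illustrates the general technique rather than the authors' code.

```python
# Minimal sketch of META-DES dynamic ensemble selection over a Random Forest
# pool, assuming the third-party `deslib` package; synthetic features stand in
# for the clinical/texture features extracted from the ROI in the paper.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from deslib.des.meta_des import METADES

X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_dsel, X_test, y_dsel, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

pool = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# META-DES picks, per test sample, the most competent trees from the pool
# using a separate dynamic-selection (DSEL) set.
des = METADES(pool_classifiers=pool).fit(X_dsel, y_dsel)
print(f"META-DES accuracy: {des.score(X_test, y_test):.3f}")
```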
Ruben Hemelings et al., [21] used labeled fundus images from 13 diverse data sources, including BMES, GHS, and eleven publicly available databases. To minimize data discrepancies, the authors developed a standardized image-processing strategy to extract 30° disc-centered images from the original data. Testing involved a total of 149,455 images. For the BMES and GHS cohorts, the Area Under the Receiver Operating Characteristic (AUROC) curve reached 0.976 and 0.984 at the participant level, respectively. At a fixed specificity of 95%, sensitivities were 87.3% and 90.3%, respectively, surpassing the minimum 85% sensitivity recommended by Prevent Blindness America. Across the eleven public databases, the AUROC ranged from 0.854 to 0.988.
Veronika Kurilová et al., [22] showed that an average-voting ensemble of multiple Convolutional Neural Network (CNN) models trained on the REFUGE database achieved the highest accuracy (98%) and AUROC score, surpassing the individual VGG-16, ResNet-50, and MobileNet models. Among the single CNN models, ResNet-50 performed best. The authors note that ensemble methods can significantly improve predictive performance, but that including weaker-performing models can degrade overall results.
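The average-voting idea is straightforward to express in code. The sketch below is a simplified stand-in for the paper's setup: it averages the glaucoma probabilities of three ImageNet-pretrained Keras backbones, which in practice would first be fine-tuned on REFUGE and fed properly preprocessed fundus images.

```python
# Illustrative average-voting ensemble of three ImageNet-pretrained backbones.
# A sketch only: the paper fine-tuned these models on REFUGE fundus images,
# and each backbone would normally get its own preprocess_input step.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import VGG16, ResNet50, MobileNet

def build_binary_classifier(backbone_cls, name):
    base = backbone_cls(weights="imagenet", include_top=False, pooling="avg",
                        input_shape=(224, 224, 3))
    out = tf.keras.layers.Dense(1, activation="sigmoid")(base.output)
    return tf.keras.Model(base.input, out, name=name)

models = [build_binary_classifier(c, n) for c, n in
          [(VGG16, "vgg16"), (ResNet50, "resnet50"), (MobileNet, "mobilenet")]]

def ensemble_predict(batch):
    # Soft voting: average the per-model glaucoma probabilities.
    probs = np.stack([m.predict(batch, verbose=0) for m in models], axis=0)
    return probs.mean(axis=0)

dummy = np.random.rand(2, 224, 224, 3).astype("float32")
print(ensemble_predict(dummy))  # averaged probabilities for 2 images
```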
Gavin D’Souza et al., [23] used AlterNet-K, a compact model that merges ResNets and multi-head self-attention. Trained on the Rotterdam EyePACS AIROGS database, it achieved 91.6% accuracy, 0.968 AUROC, and a 91.5% F1 score in glaucoma detection, outperforming various transformer and standard CNN models. The model's success is attributed to its alternating pattern of ResNet blocks and multi-head self-attention, which leverages their complementary strengths for better generalizability. The results suggest that smaller, parameter-efficient CNNs combined with multi-head self-attention can achieve high accuracy in medical image classification tasks, potentially outperforming larger models.
Sajib Saha et al., [24] developed a CNN-powered system that achieves exceptional accuracy in glaucoma detection from color fundus images. The system first isolates the optic disc (OD) with a custom YOLO network, then performs glaucomatous vs. non-glaucomatous classification with a MobileNet architecture. Extensive testing against seven state-of-the-art CNNs yielded outstanding results, including 97.4% accuracy and a 97.3% F1 score, with sensitivity, specificity, and AUROC of 97.5%, 97.2%, and 0.993, respectively.
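The two-stage design can be outlined as follows. This is a hedged sketch: `locate_disc` is a hypothetical stub standing in for the paper's custom YOLO detector, and the MobileNet classifier here is untrained.

```python
# Sketch of the two-stage pipeline: locate the optic disc, then classify the
# crop. `locate_disc` is a hypothetical stand-in for the paper's custom YOLO
# detector; it should return a bounding box (x, y, w, h) in pixel coordinates.
import numpy as np
import tensorflow as tf

classifier = tf.keras.applications.MobileNet(
    weights=None, input_shape=(224, 224, 3), classes=2)  # trained separately

def locate_disc(image):
    # Hypothetical detector stub: a centre crop as a placeholder bounding box.
    h, w = image.shape[:2]
    return w // 4, h // 4, w // 2, h // 2

def classify_fundus(image):
    x, y, bw, bh = locate_disc(image)
    crop = image[y:y + bh, x:x + bw]
    crop = tf.image.resize(crop, (224, 224))[tf.newaxis, ...]
    probs = classifier(crop, training=False).numpy()[0]
    return {"non-glaucomatous": probs[0], "glaucomatous": probs[1]}

print(classify_fundus(np.random.rand(512, 512, 3).astype("float32")))
```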
Latif J. et al., [25] employed the Enhanced Grey Wolf Optimized Support Vector Machine (EGWO-SVM) for glaucoma classification. First, they eliminated noise using the Adaptive Median Filter (AMF). Speeded-Up Robust Features (SURF), Histogram of Oriented Gradients (HOG), and global features were then used for feature extraction. Classification used the EGWO technique together with an SVM. Testing on the ORIGA database produced strong performance, with an accuracy of 94%, specificity of 92%, and sensitivity of 92%.
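A simplified version of this feature-then-classify approach is sketched below using HOG features and an SVM; a plain grid search stands in for the EGWO hyperparameter optimization, and random arrays stand in for denoised fundus images.

```python
# HOG-feature + SVM sketch. The paper tunes the SVM with Enhanced Grey Wolf
# Optimization (EGWO); here a plain grid search stands in for that step, and
# random images stand in for AMF-denoised fundus crops.
import numpy as np
from skimage.feature import hog
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(0)
images = rng.random((40, 64, 64))          # placeholder grayscale images
labels = rng.integers(0, 2, size=40)       # 0 = normal, 1 = glaucoma

X = np.array([hog(img, pixels_per_cell=(16, 16), cells_per_block=(2, 2))
              for img in images])

search = GridSearchCV(SVC(kernel="rbf"),
                      {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]},
                      cv=3).fit(X, labels)
print("best params:", search.best_params_)
```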
Milon Biswas et al., [26] proposed a lightweight CNN to detect retinal disorders, focusing on two binary decisions: distinguishing healthy from non-healthy cases, and screening among the non-healthy images. They evaluated its performance on two well-defined public databases. For differentiating healthy from non-healthy images, the CNN reached an accuracy of 99.67% on the Diabetic Retinopathy (DR) database and 96.5% on the Glaucoma (GL) database. Furthermore, in the non-healthy screening setting, aiming to differentiate between retinal disorders, the CNN achieved an accuracy of 99.03% when distinguishing between GL and DR cases.
Ghorui A. et al., [27] proposed a novel CNN architecture called ProspectNet. It outperforms two established pre-trained networks, VGG16 and DenseNet121, exhibiting higher accuracy with reduced computational time and complexity. They used a combined database from DRISHTI-GS and the glaucoma database on Kaggle, containing color fundus images of normal and glaucomatous eyes. ProspectNet achieved an AUROC of 0.991, a specificity of 98%, and a precision of 98%.
Alice K. et al., [28] aimed to build glaucoma-detection models using ML algorithms and image feature descriptors on a publicly accessible retinal fundus image database, classifying the images as normal or abnormal. Their classification process occurred in two stages: first, image features were extracted using specific filters; then a tree-based ensemble classifier was trained and tested to achieve optimal accuracy. The experiment iteratively explored three effective filters: Edge Histogram (EH), Pyramid Histograms of Orientation Gradients (PHOG), and Fuzzy Color and Texture Histogram (FCTH), and evaluated filter combinations to determine the most effective one. Employing the EH filter in conjunction with FCTH, using an RF classifier, reached the highest accuracy of 80.43% and an AUROC of 0.884.
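The two-stage filter-then-ensemble idea can be illustrated as follows; simple intensity and edge histograms stand in here for the EH/PHOG/FCTH descriptors, which come from specialized image-retrieval libraries.

```python
# Sketch of the two-stage pipeline: hand-crafted descriptors, then a
# tree-based ensemble. Plain grayscale/edge histograms stand in for the
# EH/FCTH descriptors used in the paper.
import numpy as np
from scipy import ndimage
from sklearn.ensemble import RandomForestClassifier

def describe(image):
    # Concatenate an intensity histogram with an edge-magnitude histogram.
    edges = ndimage.sobel(image)
    h1, _ = np.histogram(image, bins=16, range=(0, 1))
    h2, _ = np.histogram(np.abs(edges), bins=16)
    return np.concatenate([h1, h2]).astype(float)

rng = np.random.default_rng(1)
images = rng.random((60, 64, 64))          # placeholder fundus images
labels = rng.integers(0, 2, size=60)       # 0 = normal, 1 = abnormal

X = np.array([describe(img) for img in images])
clf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```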
Ahmed MT. et al., [29] employed DL techniques to identify open-angle glaucoma in fundus images using three distinct architectures: VGG16, VGG19, and ResNet50. They classified eyes as positive or negative for glaucoma using the Kaggle database. Notably, data augmentation significantly improved the performance of all three models, with accuracies ranging from 93% to 97.56%. Among them, VGG19 proved the most accurate, at 97.56%.
Raju M. et al., [30] utilized four ML classification methods on Electronic Health Records (EHR) from more than 650 medical facilities in the US to predict glaucoma before clinical symptoms manifest, allowing for potential early intervention and preventive treatment. XGBoost, Multilayer Perceptron (MLP), and RF exhibited similarly favorable results with an AUROC of 0.81, while Logistic Regression (LR) achieved 0.73. These models effectively predicted glaucoma one year before onset from patient EHR data, suggesting the potential of ML to identify at-risk patients before glaucoma develops.
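On tabular EHR-style data, this comparison is easy to reproduce in outline. The sketch below evaluates the same four model families by AUROC, assuming the third-party xgboost package is available and using synthetic data in place of patient records.

```python
# Sketch comparing the four model families on tabular features, reporting
# AUROC as in the paper; synthetic data stands in for the EHR extracts.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from xgboost import XGBClassifier  # third-party; assumed installed

X, y = make_classification(n_samples=2000, n_features=30, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(n_estimators=300, random_state=0),
    "MLP": MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
    "XGBoost": XGBClassifier(eval_metric="logloss"),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUROC = {auc:.3f}")
```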
N. J. Shyla and W. S. Emmanuel [31] proposed a technique for OD segmentation and classification using DL and Pattern Classification Neural Networks (PCNs). They first resize the input image and employ level-set segmentation for OD segmentation. AlexNet is then used for classification into normal and glaucoma classes. Additionally, the glaucoma images are fed to the PCN to classify them as initial, moderate, or severe. The Neural Network (NN) is trained using statistical features and the Cup-to-Disc Ratio (CDR). This work, evaluated on the DRISHTI-GS, LAG, and RIM-ONE databases, achieved an accuracy, sensitivity, and specificity of 98.42%, 97.6%, and 97.5%, respectively.
Venkateswara Rao Naramala et al., [32] used Restricted Boltzmann Machines (RBMs) to extract and analyze multiple features from retinal images to classify anomalies and automate the diagnostic process. The investigation also used a U-Net model to segment the ocular images and applied the Squirrel Search Algorithm (SSA) to fine-tune the RBM hyperparameters for optimal performance. For evaluation, the RIM-ONE database was used, on which the proposed model achieves 99.2% accuracy.
Yao Li et al., [33] explored Drop-Coating Deposition Raman Spectroscopy (DCDRS) as a new, non-invasive method to distinguish patients with glaucoma from healthy individuals using tear samples. Tears from 63 individuals were analyzed for their Raman spectra. The high-dimensional Raman data were processed with Principal Component Analysis-Linear Discriminant Analysis (PCA-LDA) to identify key features, and an SVM classifier built on the PCA-LDA results was used to categorize the samples. DCDRS successfully differentiated patients with glaucoma from healthy individuals with a total accuracy of 93.2%; differences in the protein and lipid content of tears, reflected in the Raman spectra, contributed to the classification. With 30% of the database held out for validation, the classification accuracy remained at 90.9%.
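The PCA-LDA-then-SVM chain maps naturally onto a scikit-learn pipeline; in the sketch below, random vectors stand in for the preprocessed Raman spectra.

```python
# PCA -> LDA -> SVM sketch for high-dimensional spectra, mirroring the
# PCA-LDA feature reduction described in the paper; random vectors stand in
# for preprocessed Raman spectra.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
spectra = rng.random((63, 1500))         # 63 tear samples, 1500 wavenumbers
labels = rng.integers(0, 2, size=63)     # 0 = healthy, 1 = glaucoma

model = make_pipeline(
    PCA(n_components=10),                # compress collinear spectral bands
    LinearDiscriminantAnalysis(),        # project onto the discriminant axis
    SVC(kernel="linear"),
)
model.fit(spectra, labels)
print("training accuracy:", model.score(spectra, labels))
```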
Reshma Verma et al., [34] compared SVM and K-means clustering for determining the CDR from fundus images. SVM outperformed K-means in both accuracy and consistency of CDR determination, and identification of the severity of early-stage glaucoma was possible. The authors used a convex hull approach for diagnosis and classification and developed a web application offering an inexpensive, user-friendly screening tool. The limitations of this work are that the convex hull algorithm for contour joining may be slow and that the study relies on OCT images captured by trained professionals with specialized equipment.
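For context, once the cup and disc are segmented, the CDR itself is a simple ratio of diameters. A minimal sketch with synthetic binary masks (the helper names are illustrative, not from the paper):

```python
# Vertical cup-to-disc ratio (CDR) from binary cup/disc masks, a minimal
# sketch of the quantity the compared pipelines estimate.
import numpy as np

def vertical_diameter(mask):
    rows = np.where(mask.any(axis=1))[0]
    return 0 if rows.size == 0 else rows.max() - rows.min() + 1

def vertical_cdr(cup_mask, disc_mask):
    return vertical_diameter(cup_mask) / vertical_diameter(disc_mask)

disc = np.zeros((100, 100), bool); disc[20:80, 20:80] = True
cup = np.zeros((100, 100), bool);  cup[35:65, 35:65] = True
print(f"CDR = {vertical_cdr(cup, disc):.2f}")  # 30/60 = 0.50
```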
Zefree Lazarus Mayaluri and Satyabrata Lenka [35] presented a modified dichromatic reflection model to separate specular reflections from corrupted fundus images. A modified U-Net CNN is used for this separation task and also to accurately segment the relevant ROI from the preprocessed images. Relevant features, likely representing morphological and structural characteristics related to glaucoma, are extracted from the segmented images. An SVM classifier, trained with different kernels, then classifies the images into glaucomatous and non-glaucomatous categories based on the extracted features. After comparing seven existing methods for obtaining diffuse and specular components, the authors adopted the one that produced the highest-quality images and used its output in the subsequent steps of the screening process. The experimental results showed a maximum improvement of 37.97 dB in PSNR and 0.961 in SSIM during preprocessing, and the model reached an accuracy of 91.83%, a sensitivity of 96.39%, a specificity of 95.37%, and an AUROC of 0.971 for detection.
M. Raveenthini and R. Lavanya [36] presented a new Computer-Aided Diagnosis (CAD) system to diagnose DR and glaucoma simultaneously, which could be a game changer for large-scale screening programs by significantly reducing manpower and time requirements. They eliminate the need for separate DR and glaucoma systems by using a segmentation-independent approach, avoiding the image-quality and anatomical issues that can affect segmentation accuracy. They constructed an ensemble of an RF classifier and CNNs, utilizing non-linear features such as Higher Order Spectra (HOS), fractal, and entropy features, which capture essential image details beyond basic pixel intensities. The ensemble combines the strengths of the RF and DL models using the sum rule for improved accuracy, sensitivity, and specificity.
Senthil Kumar Arunachalam et al., [37] introduced Deep Neural Perona–Malik Diffusive Mean Shift Mode Seeking Segmented Image Classification (DNP-MDMSMSIC) for the detection of glaucoma and Stargardt disease. It uses space-variant Perona–Malik diffusive preprocessing to reduce noise while preserving edges. Intensity, color, and texture features are extracted with high accuracy, and mean-shift mode-seeking segmentation then segments the image based on these features. A Bregman divergence function classifies images on the basis of segmented-region similarity. On the ACRIMA database, DNP-MDMSMSIC achieved 8% higher accuracy and 20% faster detection than previous methods.
Somasundaram Devaraj and Senthil Kumar Arunachalam [38] proposed the Max Pool Convolution Neural Kuan Filtered Tobit Regressive Segmentation based Radial Basis Image Classifier (MPCNKFTRS-RBIC) for detecting early glaucoma and Stargardt disease with high accuracy and low processing time. It uses a weighted adaptive Kuan filter for preprocessing the fundus image. Intensity, color, and texture features are extracted with high accuracy, and Tobit regressive segmentation then partitions the image based on the extracted features. A radial basis function classifier analyzes the segmented images for classification. MPCNKFTRS-RBIC achieved good performance on various metrics across different image sizes and databases.
Alifia Revan Prananda et al., [39] suggested analyzing damage to the retinal nerve fiber layer for glaucoma detection. The proposed method has two steps: preprocessing and classification. In the first step, unnecessary parts, such as the OD and blood vessels, are removed because they could hinder the analysis. For classification, nine DL architectures were evaluated. The proposed method achieved its highest accuracy of 92.88%, with an AUROC of 0.8934, on the ORIGA database.
Abdelali Elmoufidi et al., [40] suggested automating glaucoma diagnosis from fundus images. Their framework operates as follows: ROIs are decomposed into components using the Bi-dimensional Empirical Mode Decomposition (BEMD) algorithm; DL features are extracted from these decomposed components using the VGG19 CNN architecture; the features are aggregated for each ROI using a bag-of-features approach; due to their high dimensionality, the features are then reduced using PCA; and the resulting bags of features serve as input to an SVM classifier for the final diagnosis. The public ACRIMA and REFUGE databases were used for model training, while testing involved a combination of ACRIMA, REFUGE, ORIGA-light, RIM-ONE, sjchoi86-HRF, and Drishti-GS1. The REFUGE-trained model achieved overall accuracies of 98.31%, 98.61%, 96.43%, 96.67%, 95.24%, and 98.60% on ACRIMA, REFUGE, RIM-ONE, ORIGA-light, Drishti-GS1, and sjchoi86-HRF, respectively. Similarly, the model trained on ACRIMA achieved accuracies of 98.92%, 99.06%, 98.27%, 97.10%, 96.97%, and 96.36% on the same databases, respectively. The above-reviewed articles are summarized in Table (1).
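Before the summary table, the deep-feature portion of this pipeline (VGG19 features, PCA reduction, SVM classification) is sketched below; the BEMD decomposition and bag-of-features aggregation steps from the paper are omitted for brevity, and random arrays stand in for the ROIs.

```python
# Sketch of the feature pipeline: deep features from VGG19, reduced with PCA,
# classified with an SVM. The BEMD decomposition and bag-of-features pooling
# described in the paper are intentionally omitted here.
import numpy as np
import tensorflow as tf
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

backbone = tf.keras.applications.VGG19(weights="imagenet", include_top=False,
                                       pooling="avg")  # 512-d feature vector

def deep_features(images):
    x = tf.keras.applications.vgg19.preprocess_input(images * 255.0)
    return backbone.predict(x, verbose=0)

rng = np.random.default_rng(0)
rois = rng.random((30, 224, 224, 3)).astype("float32")  # placeholder ROIs
labels = rng.integers(0, 2, size=30)

model = make_pipeline(PCA(n_components=20), SVC(kernel="rbf"))
model.fit(deep_features(rois), labels)
print("training accuracy:", model.score(deep_features(rois), labels))
```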
Table (1) A Summary of reviewed papers.
Reference number | Database | Pre-processing | Feature Extraction | Classification and detection | Results (%) |
[3] | RIM-ONE, DRISHTI-GS, and 300 fundus images from a hospital | Blood vessels removal, OD and OC segmentation | Texture directionality feature extracted from N + 1 directional difference of Gaussian, Gabor, Hu-invariant moments, and color features, along with gray-level co-occurrence matrix based features | Using dynamic selection techniques, two types of ensemble of classifiers were used: 1- The homogeneous ensemble of RF classifiers 2- The heterogeneous ensemble of classifiers | The most promising results came from an ensemble of RF: For the hospital database: accuracy: 100, specificity: 100, and sensitivity: 100; RIM-ONE: accuracy: 97.86, specificity: 100, and sensitivity: 93.85; DRISHTI-GS: accuracy: 97, specificity: 90, and sensitivity: 100 |
[21] | BMES, GHS, AIROGS, ORIGA, LAG, ODIR, REFUGE1, REFUGE2, RIM-ONEr3, RIM-ONE DL, GAMMA, ACRIMA, and PAPILA | Different steps for the different databases | - | CNN | BMES and GHS: AUROC reached 0.976 and 0.984, respectively; for the remaining databases, the AUROC ranged from 0.854 to 0.988 |
[22] | REFUGE | - | - | Average-voting ensemble of ResNet-50, VGG-16, and MobileNet | Reported AUROC, precision, recall, true positives, true negatives, false positives, and false negatives; the best accuracy for the ensemble, using average voting, was 98 |
[23] | Rotterdam EyePACS AIROGS | Cropping and resizing images to 224 × 224 | - | AlterNet-K | Accuracy: 91.6, Recall: 90.7, AUROC: 0.968, F1 score: 91.5 |
[24] | LAG, sjchoi86 HRF, ACRIMA, DRISHTI-GS, HRF, DRIONS-DB, and RIM-ONE | OD localization with a YOLO CNN | - | MobileNet | Accuracy: 97.4, F1 score: 97.4, Sensitivity: 97.5, Specificity: 97.2, AUROC: 0.993 |
[25] | ORIGA | AMF to eliminate image noise | SURF, HOG, and global features | EGWO with SVM | Accuracy: 94 Specificity: 92 Sensitivity: 92 |
[26] | DR and GL databases collected from Kaggle and MESSIDOR | - | - | DL (lightweight CNN) | For distinguishing between non-healthy and healthy images, accuracy was 99.67 on the DR database and 96.5 on the GL database; accuracy was 99.03 when distinguishing between GL and DR cases |
[27] | DRISHTI-GS and Kaggle | Binary masking to determine the ROI, grayscale conversion, and resizing to 224 × 224 | - | DL (DenseNet121, VGG16, and ProspectNet) | ProspectNet: AUROC: 0.991, specificity: 98, precision: 98 |
[28] | DB1, DB2, and DB3 obtained from RIM-ONE | DB1 and DB2 images merged into one folder; sequential file naming generates class names for an ARFF file with filenames and class values; images transformed to numeric data using filters | EH, FCTH, and PHOG filters | RF | Accuracy: 80.43 |
[29] | Kaggle | Images resized to 448 × 672 pixels, with data augmentation to reduce bias | - | DL (VGG16, VGG19, and ResNet50) | Highest accuracy: VGG19 at 97.56 |
[30] | EHR from over 650 hospitals and clinics throughout the US | Filtering, transformation, binarization, and joining techniques | Systemic diseases, medications, and demographic information | LR, XGBoost, MLP, and RF | XGBoost, MLP, and RF achieved an AUROC of 0.81; LR achieved 0.73 |
[31] | LAG, RIM-ONE, and DRISHTI-GS | Image resizing and level-set segmentation | Various statistical features and the CDR used to train the NN | AlexNet | Accuracy: 98.42, Specificity: 97.5, Sensitivity: 97.6 |
[32] | RIM-ONE | Cropping, channel separation, data enhancement, and segmentation of images using U-Net | RBM feature extraction | RBM with SSA for selecting optimal hyperparameters and decreasing the RBM weights | Accuracy: 99.2 |
[33] | Clinical data of glaucoma patients and normal people | Noise removal, subtraction, and normalization | PCA-LDA dimensionality reduction | SVM | Accuracy: 93.2 |
[34] | DRISHTI-GS and a real database | CLAHE filter applied to adaptively increase contrast and reduce noise | OD and OC extraction | SVM, K-means clustering, and convex hull | Best accuracy (SVM): 85.39 |
[35] | Mendeley data repository | Modified U-Net CNN to separate specular reflections from corrupted fundus images and to segment the ROI | CDR, retinal nerve fiber layer, neuro-retinal rim, INST, and statistical features | SVM with different kernels | Accuracy: 91.83, Sensitivity: 96.39, Specificity: 95.37, AUROC: 0.971 |
[36] | Combined database from HRF, Kaggle, ORIGA-light, and DR HAGIS | Resizing, green-channel extraction, noise removal, contrast enhancement, and correction of non-uniform illumination | Non-linear features including HOS, fractal, and entropy features | Ensemble of an RF classifier and a CNN, fused with the sum rule | Accuracy: 98.08, Sensitivity: 98.37, Specificity: 99.07 |
[37] | ACRIMA, Retina Image Bank, and DIARETDB0 | Noise removal via space-variant Perona–Malik diffusion | Intensity, color, and texture features via a deep neural network | Mean-shift mode-seeking segmentation and a Bregman divergence function in the output layer for classification | Improvements across different metrics |
[38] | ACRIMA and Retina Image Bank | Image resizing, noise removal with a weighted adaptive Kuan filter, and quality enhancement | Intensity, color, and texture features | Radial basis function classifier (MPCNKFTRS-RBIC) | Improvements across different metrics |
[39] | ORIGA-light | Unnecessary parts such as blood vessels and the OD removed | - | AlexNet, GoogleNet, XceptionNet, ResNet-50, Inception V3, InceptionResNet, NasNet, MobileNet, and DenseNet | Highest accuracy: 92.88 with AUROC: 0.8934 using DenseNet |
[40] | ACRIMA, REFUGE, RIM-ONE, Drishti-GS1, ORIGA-light, and sjchoi86-HRF | ROI decomposition with BEMD | VGG19 CNN features, bag-of-features aggregation, and PCA reduction | SVM | Model trained on REFUGE: accuracies of 98.31, 98.61, 96.43, 96.67, 95.24, and 98.60 on ACRIMA, REFUGE, RIM-ONE, ORIGA-light, Drishti-GS1, and sjchoi86-HRF, respectively; model trained on ACRIMA: 98.92, 99.06, 98.27, 97.10, 96.97, and 96.36 on the same databases |
In the selected papers, researchers used ML only, DL only, or ensemble learning combining different ML and/or DL methods. The proportion of works using each of these three approaches appears in Figure (3).
3.1. Databases
Figure (4) shows the most frequently used databases among those in Table (1). All databases used for glaucoma detection in the selected papers are listed in Table (2).
Table (2) The databases used in the reviewed papers.
Reference number | Database | Availability | Normal images | Glaucomatous or (suspect) images | Total images |
[41] | RIM-ONE | Public | 118 | 51 | 169 |
[42] | RIM-ONEr3 | Public | 85 | 74 | 159 |
[43] | Drishti-GS | Public | 70 | 31 | 101 |
[44] | RIM-ONE DL | Public | 313 | 172 | 485 |
[24] | Sjchoi86 HRF | Public | 300 | 101 | 401 |
[45] | Rotterdam EyePACS AIROGS | Public | - | - | 112,732 |
[21] | ODIR | Public | - | - | 10,000 |
[46] | GAMMA | Public | 150 | 150 | 300 |
[47] | PAPILA | Public | 333 | 155 | 488 |
[48] | LAG | Public | 3432 | 2392 | 5824 |
[49] | DIARETDB0 | Public | 20 | 110 | 130 |
[50] | Kaggle | Public | - | - | 1000 |
[51] | DR HAGIS | Public | 0 | 10 and the remaining images for hypertension, diabetic retinopathy, and age-related macular degeneration | 39 |
[52] | Mendeley data repository | Public | 1060 | 1146 with artifacts | 2206 |
[53] | ORIGA-light | Public | 482 | 168 | 650 |
[54] | ACRIMA | Public | 309 | 396 | 705 |
[55] | DRIONS-DB | Public | 55 | 55 | 110 |
[56] | REFUGE | Public | 1080 | 120 | 1200 |
[57] | MESSIDOR | Public | - | - | 1200 |
[58] | HRF | Public | 15 | 15 + 15 images for DR | 45 |
3.2. Evaluation Criteria
Different evaluation criteria have been used to analyze the effectiveness of the proposed models. They are described as follows [59, 60]:
Sensitivity (recall): represents the percentage of True Positives (TP) that the model correctly identifies. TPs are instances in which the model correctly predicts the presence of the target class. It is given in Eq. (1):
Sensitivity = Recall = TP / (TP + FN) (1)
Here, FN stands for False Negatives. The higher the sensitivity, the better the performance of the model.
Specificity: represents the percentage of True Negatives (TN) that the model correctly identifies. TNs are instances where the model correctly predicts the absence of the target class. It can be calculated from Eq. (2) as follows:
Specificity = TN / (TN + FP) (2)
Here, FP stands for False Positives. A higher specificity indicates better model performance.
Accuracy: indicates how closely the predictions match the ground truth; higher accuracy indicates a better-performing model. In the reviewed papers it is often computed as the mean of sensitivity and specificity (the balanced accuracy), as shown in Eq. (3):
Accuracy = (Sensitivity + Specificity) / 2 (3)
Precision: measures the proportion of positive predictions that are actually correct, computed as TP / (TP + FP).
F1 score: a measure of accuracy given by the harmonic mean of precision and recall. A higher F1 score means better model performance, reflecting the model's ability to correctly identify TPs while avoiding FPs. It can be computed from Eq. (4):
F1 score = (2 × Precision × Recall) / (Precision + Recall) (4)
AUROC: shows the trade-off between True Positive Rate (TPR) and False Positive Rate (FPR) at different threshold settings. An AUROC with a higher value indicates a better overall discrimination between positive and negative cases, regardless of specific threshold choices. Table (3) provides a clear and concise overview of these evaluation criteria:
Table (3) Summary of the evaluation criteria used in the reviewed papers.
Metric | Definition | Interpretation | Limitations |
Accuracy | % of correct predictions | Overall performance | Misleading in imbalanced databases |
Specificity | % true negatives correctly identified | Effectiveness in avoiding false positives | Not relevant if TN are unimportant |
Sensitivity (recall) | % true positives correctly identified | Ability to find all relevant cases | Not relevant if FN are unimportant |
Precision | % of positive predictions that are actually correct | Proportion of positives that are TP | Not relevant if FP are unimportant |
F1 score | Harmonic mean of precision and recall | Balanced view of precision and recall | Requires equal importance of FP and FN |
AUROC | Area under ROC curve | Performance across different classification thresholds | Complex interpretation, not directly indicating class probabilities |
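As a concrete companion to Table (3), the sketch below computes each metric from a confusion matrix using made-up predictions; it reports both the standard accuracy and the balanced variant given in Eq. (3).

```python
# Computing the tabulated metrics from a confusion matrix; the labels,
# predictions, and scores here are made up purely for illustration.
from sklearn.metrics import confusion_matrix, f1_score, roc_auc_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6, 0.7, 0.3, 0.95, 0.25]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print("sensitivity: ", sensitivity)
print("specificity: ", specificity)
print("precision:   ", tp / (tp + fp))
print("accuracy:    ", (tp + tn) / (tp + tn + fp + fn))
print("balanced acc:", (sensitivity + specificity) / 2)  # Eq. (3)
print("F1 score:    ", f1_score(y_true, y_pred))
print("AUROC:       ", roc_auc_score(y_true, scores))
```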