This study proposes two approaches to designing the DR grading system: a standard operating procedure (SOP) for preprocessing the fundus image, and a revised structure of ResNet-50, described in the following subsections. Finally, the DR grading system is implemented as a website, allowing users to check a fundus image themselves. Figure 1 shows the flowchart of the proposed DR grading system.
A. Dataset
In order to verify the accuracy that a deep learning system achieves on a DR dataset, the data must be divided into training, validation, and testing sets. The DR dataset from Kaggle (https://www.kaggle.com/competitions/aptos2019-blindness-detection/data) includes 35,126 fundus images, of which 25,805 are normal (without disease). Only 9,321 fundus images exhibit DR, which is divided into four stages [10]: mild nonproliferative diabetic retinopathy (NPDR), moderate NPDR, severe NPDR, and proliferative diabetic retinopathy (PDR). The imbalanced proportion of normal and DR images in big data has been identified as one of the main challenges for learning algorithms. It commonly causes overfitting [11]: high DR grading performance on the training data but low performance on the testing data.
Figure 2 shows the method used to select the training data from the 35,126 fundus images. The validation and testing data follow a similar selection method, each comprising 300 fundus images, none of which appear in the training data.
B. SOP for fundus image preprocessing
Preprocessing is a very important stage in image recognition; it can eliminate noise and variation in the retinal fundus image and improve the image's quality and contrast. Consequently, the trained models obtain more credible and accurate results. The steps of the proposed SOP for preprocessing fundus images (Fig. 1, step 1) are introduced in turn below.
1) Remove the black border of the fundus image:
The Kaggle dataset contains many types of fundus images, captured with different fundus photography equipment and under different conditions. For instance, the black border of the fundus image (Fig. 3 (a)) would affect the performance of DR grading. This study adopts the auto-cropping method [12] to crop out the uninformative black areas, as shown in Fig. 3 (b). The auto-cropping method, implemented by the crop_image_from_gray function [12], proceeds as follows:
- Convert the image (RGB format) to grayscale using the OpenCV library. A pixel value is 255 where the image is white, and 0 where it is black.
- Produce the clipping mask, which contains 0 and 1 values: where a pixel value > tolerance, the mask value is 1 (True); where a pixel value ≦ tolerance, the mask value is 0 (False), as shown in Fig. 3 (c). The default tolerance is 7.
- Find the rectangular area spanning the rows and columns that contain 1 values (red square in Fig. 3 (c)).
- Extract the rectangular area from the image (RGB format).
2) Create a circular crop around the center of the fundus image: After removing the black border of the fundus image, some of the information is also removed, as shown in Fig. 3 (b), and the fundus image is no longer circular. Even if we resize the fundus image (Fig. 3 (b)), the fundus will be deformed. In order to create a circular crop around the center of the fundus image, as shown in Fig. 3 (d), this study adopts the following processing steps:
- Find the height (H) and width (W) of the fundus image (H×W).
- Find the longest side (L), either the height or the width.
- Resize the fundus image (Fig. 3 (b)) to L×L.
- Produce the circular mask: a mask of radius L/2 centered on the image, whose value is one inside the circle and zero outside.
- Combine the fundus image (Fig. 3 (b)) with the circular mask using cv2.bitwise_and (OpenCV).
- Remove the black border of the fundus image again, as described above.
3) Assess quality of the fundus image: In order to obtain the most important features from fundus images, this study adopts the Eye-Quality (EyeQ) library [13, 14] to assess image quality with three labels: reject, usable, and good (Fig. 4). The EyeQ library was developed from the EyePACS dataset (https://www.kaggle.com/c/diabetic-retinopathy-detection) to provide fundus image quality assessment, using a multiple color-space fusion network (MCF-Net) based on DenseNet-121. Only usable- and good-quality fundus images are retained to train and test the performance of DR grading.
4) Equalize the histogram of the fundus image:
This study equalizes the histogram of the fundus image, changing the intensity distribution toward a uniform distribution to enhance the contrast and make features relatively clear. Histogram equalization is applied to the good- and usable-quality images of Fig. 4. An image in RGB format should be converted to YCrCb or HSV format before equalizing the histogram. In YCrCb format, Y is the luma component, and Cr and Cb are the red-difference and blue-difference chroma components. HSV format is an alternative representation of the RGB color model, with three components: hue, saturation, and value. This study converts the image from RGB (Fig. 5 (a)) to HSV format first, and then equalizes the histograms of the hue and value channels of the fundus image [15] (Fig. 5 (b)).
C. Revised structure of ResNet-50
To solve classification problems, many different types of ResNets are used, with different numbers of layers: specifically, 18, 34, 50, 101, and 152 layers [16]. The current deep learning framework for detecting and grading DR is ResNet-50 [8, 9]. However, ResNet-50 suffers from overfitting and fluctuations in accuracy, which limit its accuracy in detecting DR. This study proposes three strategies to improve the performance of ResNet-50, as follows:
1) Adaptive learning rate in ResNet-50: The learning rate is a particular issue in deep learning. A learning rate that is too high causes excessively large weight updates, so the model's performance oscillates over training epochs. A learning rate that is too low may cause training never to converge, or to get stuck in a local solution. Thus, this study adopts an adaptive learning rate for ResNet-50, as follows:
- Set the learning rate (\(lr = 0.01\)) and the factor (\(factor = 0.5\)).
- Set the lower bound of \(lr\), where \(lr > 0\).
- If the performance of ResNet-50 fails to change for two consecutive epochs, the learning rate is adjusted according to Eq. (1):

$${lr}^{\prime} = lr \times factor \quad (1)$$
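The rule above can be sketched in plain Python (names are illustrative; in Keras, the ReduceLROnPlateau callback with factor=0.5, patience=2, and a min_lr bound provides the same behavior):

```python
def adapt_lr(lr, history, factor=0.5, min_lr=1e-6, patience=2):
    """Apply Eq. (1): multiply lr by `factor` when the monitored metric
    has not improved over the last `patience` epochs, keeping lr > 0."""
    if len(history) > patience and max(history[-patience:]) <= max(history[:-patience]):
        lr = max(lr * factor, min_lr)      # lr' = lr * factor, bounded below
    return lr
```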
2) Regularization: Regularization can be employed to minimize the overfitting of the training model [17]. There are two common methods: L1 and L2 regularization. This study applies both L1 and L2 regularization, with kernel_regularizer, which applies a penalty on the layer’s kernel [18, 19], and activity_regularizer, which applies a penalty on the layer’s output [20].
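The penalty terms can be illustrated directly (a NumPy sketch of what Keras' kernel_regularizer=l1_l2(l1, l2) adds to the loss for a layer's kernel, and what activity_regularizer computes on its output; the coefficient values are illustrative):

```python
import numpy as np

def l1_l2_penalty(weights, l1=1e-4, l2=1e-4):
    """Combined L1/L2 penalty added to the training loss:
    l1 * sum(|w|) + l2 * sum(w^2)."""
    w = np.asarray(weights, dtype=float)
    return l1 * np.abs(w).sum() + l2 * np.square(w).sum()
```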
3) Obtain suitable features from conv5_block1_out and conv5_block2_out in ResNet-50: A visualization tool can be applied to observe the features in different layers of ResNet-50 [21–23]. In the conv5_block1_out and conv5_block2_out layers, Fig. 6 (a) and (b) show distinctive features that indicate the bleeding region in red. However, the bleeding region does not appear clearly in the final layer of ResNet-50 (Fig. 6 (c)). If the features of the two layers are combined (Fig. 6 (d)), the accuracy of DR grading should improve. Therefore, this study evaluates different operations for combining the features of conv5_block1_out with those of conv5_block2_out.
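As an illustration, such a combination can be an element-wise operation on two equally shaped feature maps (a NumPy sketch; in Keras, this corresponds to merging the outputs of conv5_block1_out and conv5_block2_out with an Add() layer, with the exact operation chosen empirically):

```python
import numpy as np

def combine_features(f1, f2):
    """Merge two same-shaped feature maps by element-wise addition
    followed by ReLU (one of several candidate operations)."""
    assert f1.shape == f2.shape
    return np.maximum(f1 + f2, 0.0)        # add, then ReLU
```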
D. Online DR grading system
This study adopts Python, HTML, and JavaScript for web development. Functions include a web application framework, sitemap management, and interactive web design. The online DR grading system is accessed via "POST," a request method supported by HTTP on the World Wide Web. Users can upload a fundus image through the online DR grading system; the trained model on the server grades the image to evaluate whether DR is present, and the results are then returned and shown on the website. Figure 7 shows the flowchart of the online DR grading system.
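A minimal sketch of the server-side POST endpoint, assuming Flask and a hypothetical grade_fundus() wrapper around the trained model (route and field names are illustrative):

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def grade_fundus(image_bytes):
    # Placeholder for the trained model's prediction (0 = no DR ... 4 = PDR).
    return 0

@app.route("/grade", methods=["POST"])
def grade():
    f = request.files["fundus"]            # fundus image uploaded via HTTP POST
    stage = grade_fundus(f.read())         # server-side grading
    return jsonify({"stage": stage})       # result returned to the website
```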