
Jung and Kim*: Feature Extraction of Non-proliferative Diabetic Retinopathy Using Faster R-CNN and Automatic Severity Classification System Using Random Forest Method

Younghoon Jung and Daewon Kim*

Feature Extraction of Non-proliferative Diabetic Retinopathy Using Faster R-CNN and Automatic Severity Classification System Using Random Forest Method

Abstract: Non-proliferative diabetic retinopathy is a representative complication of diabetic patients and is known to be a major cause of impaired vision and blindness. There has been ongoing research on the automatic detection of diabetic retinopathy; however, there is also a growing need for research on an automatic severity classification system. This study proposes an automatic detection system for pathological symptoms of diabetic retinopathy such as microaneurysms, retinal hemorrhage, and hard exudate by applying the Faster R-CNN technique. An automatic severity classification system was devised by training and testing a Random Forest classifier on the data obtained through preprocessing of the detected features. An experiment classifying 228 test fundus images with the proposed classification system showed 97.8% accuracy.

Keywords: Faster R-CNN , Classification , Machine Learning , Non-proliferative Diabetic Retinopathy , Random Forest

1. Introduction

Diabetic retinopathy (DR) is a representative microvascular complication in diabetic patients and a major cause of impaired vision and blindness worldwide. The disease occurs in approximately 60% of diabetic patients [1]. Early treatment through fundus examination can lower its prevalence, but discovery and treatment tend to be delayed because patients perceive no symptoms in the early stage. In non-proliferative diabetic retinopathy (NPDR), which is an early stage of DR, pathological symptoms such as microaneurysms, hard or soft exudate, and retinal hemorrhage can be observed. Microaneurysms are the first clinical finding observable in DR, and they expand abnormally as DR progresses. In severe cases, NPDR may develop into proliferative diabetic retinopathy (PDR) [2]. In retinal hemorrhage, blood vessels in the retina burst, causing blurred or impaired vision. In hard exudate, fluid leaks from the retinal blood vessels and leaves lipid deposits. The blood cholesterol level can be estimated through the hard exudate. The Early Treatment Diabetic Retinopathy Study (ETDRS) [3] classified DR into NPDR and PDR and graded its severity as no apparent retinopathy, mild NPDR, moderate NPDR, severe NPDR, and PDR. Because ophthalmologists apply a different treatment to each grade under this classification system, the severity of the patient's retinal condition must be classified [4]. There is a growing need for research to improve specificity and sensitivity for the early diagnosis and classification of DR. Many studies have detected and classified the pathological symptoms of DR using various medical image processing and machine learning techniques.
Representative examples are studies on medical image analysis using the support vector machine (SVM) [5] and the convolutional neural network (CNN), a deep learning technique that relies on large datasets. Recently, Google conducted a study on DR based on deep learning [6], and there have been many studies on the severity classification of DR using the CNN. Pratt et al. [7] used more than 80,000 images of about 6 megapixels each provided by Kaggle [8]. Studies on the diagnosis and classification of DR are being conducted continuously, and more research and investment are required in the development of an automatic classification system for the disease. Das et al. [9] proposed a deep learning architecture based on segmented fundus image features for the classification of DR, which achieved a maximum accuracy of 98.7% on the DIARETDB1 dataset. Padmanayana and Anoop [10] studied binary classification of DR using a CNN with color fundus images, the same form of data used in this research, and reported a testing accuracy of 94.6%. Unlike the features used in this study, texture features have also been used to classify the severity grades of DR through deep learning [11]; that study used ResNet, DenseNet, and DetNet and reported a highest accuracy of 96.35%. Hathwar and Srinivasa [12] studied a method for automatically grading the DR status in retinal fundus images using deep learning and reported a sensitivity of 94.3% and a specificity of 95.5%, which are very similar to the results of this study. Another study [13] classified the stages of DR using the same Messidor [14] data used in this study. It applied several enhanced deep learning methods, mainly AlexNet, VGG, GoogleNet, and ResNet, and reported an accuracy of 99.66%. A further study [15] automatically graded DR using ResNet; among the various networks it tested, the accuracy reached up to 86.67% on the Indian Diabetic Retinopathy Image Dataset (IDRiD). In a study [16] that classified DR using the R-CNN, a deep learning method, the accuracy was 93%, which was 7.4% and 37.83% higher than the performances of the SVM and k-nearest neighbor (KNN), respectively. Another study [17] classified the severity of DR using the Kaggle dataset; it proposed a binary CNN and, after comparing its performance with various existing deep learning networks, reported an accuracy of 91.04%. Many other studies have classified the severity of DR using deep learning, and several survey papers [18-21] have collected the architectures and results of these methods.
This study detected microaneurysms, retinal hemorrhage, and hard exudate, which are the early-stage features of NPDR, from fundus images using Faster R-CNN [22], and instantly diagnosed the fundus condition by analyzing the pathological features using a classifier. The proposed system operates in two main steps: a detection step and a classification step with diagnosis. For the second step, a Random Forest classifier [23] was designed to classify severity by analyzing the ratio of pathological symptoms such as microaneurysms and retinal hemorrhage.

2. Diagnosis of Diabetic Retinopathy using Faster R-CNN and Random Forest Method

This study designed a system that can quickly detect the pathological features of DR and classify severity grades regardless of brightness, contrast, and color tone of the fundus imaging system using the Faster R-CNN and Random Forest methods.

First, the pathological features of DR such as microaneurysms, retinal hemorrhage, and hard exudate are detected using the Faster R-CNN method. In the second step, the retinal condition is diagnosed using a Random Forest algorithm, which yields the severity grade classification. Fig. 1 shows the overall structure of the DR detection and automatic classification system.

Fig. 1.

Structure of DR diagnosis system using Faster R-CNN and Random Forest.
2.1 Detecting the Features of Diabetic Retinopathy using Faster R-CNN

Labeling work is performed to detect the features of DR. Microaneurysms and retinal hemorrhage are classified as Label 0 and hard exudate is classified as Label 1. The DR data for learning are in XML file format, which consists of four variables: path of the image file, total size of image, and the names and the coordinates of objects to learn. Examples of microaneurysms, retinal hemorrhage and hard exudate are shown in Fig. 2.
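As a hedged illustration of the annotation format described above, the sketch below reads one annotation with Python's standard `xml.etree.ElementTree` module. The element names (`annotation`, `path`, `size`, `object`, `name`, `bndbox`) follow the common Pascal VOC layout and are assumptions, as are the object name strings and the sample values; the text only states that each file stores the image path, the image size, and the names and coordinates of the objects.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in a Pascal VOC-like layout (tag names are assumed).
SAMPLE_XML = """
<annotation>
  <path>images/001.jpg</path>
  <size><width>2240</width><height>1488</height></size>
  <object>
    <name>hemorrhage</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>140</xmax><ymax>170</ymax></bndbox>
  </object>
  <object>
    <name>hard_exudate</name>
    <bndbox><xmin>300</xmin><ymin>310</ymin><xmax>330</xmax><ymax>345</ymax></bndbox>
  </object>
</annotation>
"""

# Label 0: microaneurysms and retinal hemorrhage; Label 1: hard exudate.
# The name strings used as keys are hypothetical.
LABELS = {"microaneurysm": 0, "hemorrhage": 0, "hard_exudate": 1}


def parse_annotation(xml_text):
    """Return the image path, the image size, and a list of labeled boxes."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax"))
        objects.append({"label": LABELS[obj.findtext("name")], "box": coords})
    return {"path": root.findtext("path"), "size": (width, height), "objects": objects}


record = parse_annotation(SAMPLE_XML)
```

Each entry of `record["objects"]` then carries the label and box coordinates that a detector training pipeline would consume.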

Fig. 2.

Examples of pathological symptoms of DR: (a) microaneurysms, (b) retinal hemorrhage, and (c) hard exudate.

For the pre-trained network model, Inception-ResNet-v2 [24] was used, which stacks multiple layers in the convolution step of the CNN and has the advantages of reducing the amount of computation and improving the accuracy of complex convolutions. The learning procedure for detecting the pathological symptoms of DR with Faster R-CNN is then performed as follows. Anchors were randomly created with a mini-batch size of 16, and the region proposal network (RPN) was trained for a set number of iterations. Based on the target regions created by the trained RPN, Faster R-CNN is trained for a set number of iterations using Inception-ResNet-v2 as the pre-trained network and learning structure. Then, the convolution layers are shared based on the trained Faster R-CNN, and the RPN is trained again for a set number of iterations. The training is completed by fine-tuning the shared convolution layers to make them advantageous for detecting objects. At detection time, a feature map is extracted from the pre-trained model and handed over to the RPN and the region-of-interest (RoI) pooling layer. In the RoI pooling layer, the class scores and box coordinates for the detected objects are obtained. The structure and procedure of the Faster R-CNN used to detect the features of DR are shown in Fig. 3.

Fig. 3.

Structure of Faster R-CNN used to detect the pathological symptoms of DR.

Detected images were created by recognizing the learned pathological symptoms and inserting white boxes at the coordinates of the parts where pathological symptoms were detected in the original fundus images. The results calculated from the Faster R-CNN consist of the Pixel Values of the corresponding region based on the Coordinates of the region, Class Type determined through the labeling work, and Class Score which is a probability of belonging to the class.

2.2 Feature Information Preprocessing Step

The data refining and preprocessing steps are performed on the microaneurysms, retinal hemorrhage, and hard exudate detected in the input fundus images so that the feature information can be transmitted to the Random Forest classifier. For the microaneurysms and retinal hemorrhage, the pixel values of the corresponding objects are transformed to grayscale as in Eq. (1) so that they can be expressed as 256 brightness levels. Then, the pixels of each region are inverted using Eq. (2), and the histogram equalization of Eq. (3) is applied. After an even distribution of brightness values is created and the boundaries are clearly distinguished in this way, the background region is separated from the microaneurysm and retinal hemorrhage regions. Next, the pixels are extracted as binary data using the basic thresholding method shown in Eq. (4), and the number of pixels of the hemorrhage region is derived. Fig. 4 shows the overall preprocessing steps that follow the detection of pathological symptoms.

Fig. 4.

Preprocessing steps for detected information: (a) original image of retinal hemorrhage, (b) grayscale transformation, (c) histogram smoothing, and (d) binary image.

[TeX:] $$g=(0.2126 \times R)+(0.7152 \times G)+(0.0722 \times B)$$

[TeX:] $$x^{\prime}=|(255-x)|$$

[TeX:] $$h(v)=R\left(\frac{c d f(v)-c d f_{\min }}{(M \times N)-c d f_{\min }} \times(L-1)\right)$$

[TeX:] $$b(j, i)=\left\{\begin{array}{l} 1, f(j, i) \geq T \\ 0, f(j, i)<T \end{array}\right.$$
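The four equations above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' OpenCV implementation: it assumes that R(·) in Eq. (3) denotes rounding, that M×N is the number of pixels in the region, and that L is the number of brightness levels (256). The sample patch and the threshold T = 128 are made up for the example.

```python
from collections import Counter


def to_gray(r, g, b):
    """Eq. (1): luminance-weighted grayscale value."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def invert(x):
    """Eq. (2): inverse of an 8-bit brightness value."""
    return abs(255 - x)


def equalize(pixels, levels=256):
    """Eq. (3): histogram equalization of a flat list of integer pixels."""
    total = len(pixels)  # M x N
    hist = Counter(pixels)
    cdf, running = {}, 0
    for v in sorted(hist):
        running += hist[v]
        cdf[v] = running
    cdf_min = min(cdf.values())
    denom = total - cdf_min
    if denom == 0:  # constant region: nothing to equalize
        return list(pixels)
    return [round((cdf[v] - cdf_min) / denom * (levels - 1)) for v in pixels]


def binarize(pixels, t):
    """Eq. (4): 1 where the pixel is at least T, otherwise 0."""
    return [1 if p >= t else 0 for p in pixels]


# Tiny 2x2 RGB patch, flattened; the values and T = 128 are hypothetical.
patch = [(200, 40, 40), (90, 20, 20), (220, 180, 170), (230, 200, 190)]
gray = [round(to_gray(*rgb)) for rgb in patch]
inverted = [invert(x) for x in gray]
equalized = equalize(inverted)
mask = binarize(equalized, t=128)
lesion_pixels = sum(mask)  # number of pixels kept as lesion
```

Because the dark red lesion pixels become bright after inversion, thresholding the equalized patch keeps them as foreground, and `lesion_pixels` plays the role of the pixel count passed on to the classifier.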

To analyze the distribution of retinal hemorrhage and microaneurysms in the fundus images, the X and Y coordinates are derived from the boxes of the extracted objects, and the sum of the distances between objects is calculated as shown in Eq. (5). If one or fewer objects are detected, the value is set to 1.

[TeX:] $$\bar{D}=\sum_{i=2}^{n} \sqrt{\left(X_i-X_{i-1}\right)^2+\left(Y_i-Y_{i-1}\right)^2}$$
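A minimal sketch of the distance feature is given below. It reads Eq. (5) as a sum of Euclidean distances between consecutive objects, so the coordinate differences are squared under the root, and it takes the center of each bounding box as the object's X and Y coordinates. Both readings are assumptions, and the function names are hypothetical.

```python
import math


def box_center(box):
    """Center (x, y) of a box given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return ((xmin + xmax) / 2, (ymin + ymax) / 2)


def distance_sum(boxes):
    """Sum of Euclidean distances between consecutive box centers.

    Returns 1 when one or fewer objects are detected, as stated in the text.
    """
    if len(boxes) <= 1:
        return 1
    centers = [box_center(b) for b in boxes]
    return sum(
        math.hypot(q[0] - p[0], q[1] - p[1])
        for p, q in zip(centers, centers[1:])
    )


# Three hypothetical detections with centers (1, 1), (4, 5), and (10, 13).
boxes = [(0, 0, 2, 2), (3, 4, 5, 6), (9, 12, 11, 14)]
total_distance = distance_sum(boxes)
```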

Table 1 shows the feature data transmitted to the Random Forest classifier through the preprocessing step which are used to classify the severity of DR. These features input to the classifier consist of the number of pixels occupied by microaneurysms and retinal hemorrhage, the maximum distance between each disease object, class type and class scores obtained from the Faster R-CNN.

Table 1.

Data transmitted to the classifier after preprocessing
No. Transmitted Data
1 Number of pixels for microaneurysms and retinal hemorrhage
2 Maximum distance between objects
3 Class type
4 Class scores
2.3 Diabetic Retinopathy Diagnosis and Classification System

The severity grades were classified by analyzing the ratios of the regions occupied by microaneurysms, retinal hemorrhage, and hard exudate in the entire retina of the input fundus images using the DR classification criteria of ETDRS. First, to calculate the area of the regions of pathological symptoms, the background region is removed from the input fundus images. Then, based on the information transmitted from the Faster R-CNN algorithm, the ratio [TeX:] $$\alpha$$ of the lesion area to the total fundus region (C) is calculated with Eq. (6), where A is the number of pixels of the detected microaneurysms and retinal hemorrhage and B is the number of pixels of hard exudate.

[TeX:] $$\alpha=(A+B) / C$$

If the ratio [TeX:] $$\alpha$$ is lower than 0.258, the image is classified as mild grade; if it is higher than 0.258, the image is classified as moderate or severe grade. The [TeX:] $$\alpha$$ value of 0.258 was derived by analyzing the distribution chart of the ratios of pathological symptoms by severity grade for 175 fundus images that DR specialists had determined to be moderate or severe grade. This value serves as the criterion that separates the mild grade from the moderate and severe grades. The whole algorithm, which classifies the severity grades of DR by detecting pathological symptoms in the input images and calculating their ratios, is shown in Fig. 5.
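The ratio rule can be sketched as below. This is an illustration under stated assumptions: the text does not say how a ratio exactly equal to 0.258 is handled, so it is grouped with the higher grades here, and the caller must express the ratio in the same unit as the threshold (Table 4 reports ratios in percent). The function names are hypothetical.

```python
ALPHA_THRESHOLD = 0.258  # value reported in the text


def area_ratio(lesion_a, lesion_b, fundus_c):
    """Eq. (6): alpha = (A + B) / C, with all arguments as pixel counts."""
    if fundus_c <= 0:
        raise ValueError("fundus region must contain at least one pixel")
    return (lesion_a + lesion_b) / fundus_c


def grade_by_ratio(alpha, threshold=ALPHA_THRESHOLD):
    """Mild below the threshold, otherwise moderate or severe.

    A ratio equal to the threshold is treated as moderate or severe,
    which is an assumption of this sketch.
    """
    return "mild" if alpha < threshold else "moderate_or_severe"


# Hypothetical pixel counts: A = 1200, B = 800, C = 1,000,000.
alpha = area_ratio(1200, 800, 1_000_000)
alpha_percent = alpha * 100  # same unit as the ratios listed in Table 4
grade = grade_by_ratio(alpha_percent)
```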

Fig. 5.

Severity grade classification method by ratio of pathological symptoms in the retina.

This research also used the Random Forest technique, a representative supervised machine learning algorithm. The Random Forest algorithm is designed to avoid the over-fitting and under-fitting of a single decision tree. It trains each decision tree on randomly selected feature variables from the dataset and combines the resulting trees into one classifier, which makes it an ensemble learning method. Fig. 6 shows the result of the feature importance analysis, obtained by comparing the average information gain of each feature across the decision trees that make up the Random Forest. This figure shows that the importances of the number of pixels representing the disease area, the maximum distance between diseased objects, the class type, and the class scores were 45%, 24%, 21%, and 10%, respectively. Fig. 7 shows the flowchart of the severity grade classification method of DR using the Faster R-CNN and the Random Forest classifier.
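To make the importance measure more concrete, the sketch below computes the information gain of a single split, the per-node quantity that a Random Forest averages over its trees to rank features. The text does not state the impurity measure, so Shannon entropy is assumed here, and the labels in the example are made up.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def information_gain(parent, left, right):
    """Entropy reduction obtained by splitting `parent` into `left` and `right`."""
    total = len(parent)
    weighted = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    return entropy(parent) - weighted


# Hypothetical node with four mild and four severe samples.
parent = ["mild"] * 4 + ["severe"] * 4
perfect_gain = information_gain(parent, ["mild"] * 4, ["severe"] * 4)
partial_gain = information_gain(
    parent,
    ["mild", "mild", "mild", "severe"],
    ["mild", "severe", "severe", "severe"],
)
```

A feature whose splits separate the grades cleanly yields a larger gain than one whose splits leave the grades mixed, which is why such a feature would rank higher in an importance chart.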

The overall sequence consists of detecting the pathological symptoms of DR using Faster R-CNN and classifying the severity grade using Random Forest classifier. The probability of pathological symptoms and the coordinates of the box are determined by applying Faster R-CNN to the input fundus images. Then, the exact number of hemorrhage pixels included in the box through the data preprocessing step and the distance values of the distribution of pathological symptoms are transmitted to the Random Forest classifier. Finally, the classifier analyzes the refined data and classifies the severity grade of DR.

Fig. 6.

Importance of data features for Random Forest classification learning.

Fig. 7.

Severity classification method of DR using Faster R-CNN and Random Forest.

3. Experiment and Results

The experiment was conducted using CUDA toolkit 10.2 in the CAFFE environment of TensorFlow installed on the Windows 10 operating system, and OpenCV 4.0 was used for data preprocessing. Table 2 shows the hardware configuration, language, tools, and development environment used for this experiment. The data used for training and evaluating the proposed algorithm were the Messidor dataset [14] and fundus images provided by the Dankook University Medical Center (DUMC).

Table 2.

Development environment for the experiment
CPU, RAM Intel i7-10700 @ 4.80 GHz, 32 GB
GPU NVIDIA GeForce GTX 3070 16 GB
OS Windows 10 64-bit Education version
Language Python 3.7, TensorFlow GPU 2.4.0
Tools CAFFE, CUDA 11.1.1, OpenCV 4.0

Fig. 8.

Fundus images used for learning and evaluation: (a) Messidor data and (b) DUMC data.

The Messidor dataset consisted of 653 fundus images that had been classified in terms of severity. The data received from DUMC consisted of 228 fundus images that ophthalmologists had classified for severity. Fig. 8 shows sample images of Messidor and the DUMC dataset and the data were classified by the severity grade of DR as shown in Table 3.

For the Messidor data, 153 mild images, 247 moderate images, and 253 severe images were used as the learning data set for Faster R-CNN. Furthermore, 1,760 learning data were used that consisted of 767 data for microaneurysms, 649 data for retinal hemorrhage, and 344 data for hard exudate. To evaluate the classification system, 228 image data received from DUMC were used, which consisted of 53 mild images, 62 moderate images, and 113 severe images.

Table 3.

Fundus images data classification used in the experiment
Training (Messidor) Testing (DUMC) Total
Number of images 653 228 881
Mild 153 53 206
Moderate 247 62 309
Severe 253 113 366
3.2 Experiment using the Classification System

Every experiment used the Inception-ResNet-v2 model of the Faster R-CNN with the same dataset. Experiments for extracting the features of DR and for classifying severity were performed. Two classification methods were used: one based on the area ratio of the pathological symptoms and one based on the Random Forest classifier. Fig. 9 shows the network model structure involved in the block diagram of the Faster R-CNN method.

Fig. 9.

Inception-Resnet-v2 model network structure used in experiment with the Faster R-CNN.

After the DR fundus image is input to the Inception-ResNet-v2 network, it is transferred to the RPN. At this stage, the object regions in the input image are identified. Then the classifier block and object classifier of the bounding box regressor play a role in finding the appropriate box candidates from the RPN output. After that, the candidates go through RoI pooling and fully connected layers. Finally, each target object's bounding box and category label are exported. The initial learning rate was 0.0001, and a total of 12,500 training epochs were run. Training was stopped when the value of the multi-task loss function was minimized. The model corresponding to the minimum loss value was used in the testing stage. To verify the performance, both the plain CNN model and the SVM were used as control methods. For evaluation, true positive (TP) was defined as classifying mild fundus images as mild; true negative (TN) was defined as classifying moderate or severe fundus images as moderate or severe; false positive (FP) was defined as classifying moderate or severe fundus images as mild; and false negative (FN) was defined as classifying mild fundus images as moderate or severe. Sensitivity, specificity, and accuracy were defined by Eqs. (7), (8), and (9), respectively, and used to analyze and evaluate the classification results.

[TeX:] $$\text { Sensitivity }=\frac{T P}{(T P+F N)} \times 100$$

[TeX:] $$\text { Specificity }=\frac{T N}{(T N+F P)} \times 100$$

[TeX:] $$\text { Accuracy }=\frac{(T P+T N)}{(T P+F P+T N+F N)} \times 100$$
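As a quick check of these definitions, the short sketch below computes the three indices from a confusion matrix, using the standard accuracy denominator TP + TN + FP + FN. The counts are those reported for the Random Forest experiment in Table 6; the function name is hypothetical.

```python
def evaluate(tp, tn, fp, fn):
    """Sensitivity, specificity, and accuracy in percent."""
    sensitivity = tp / (tp + fn) * 100
    specificity = tn / (tn + fp) * 100
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100
    return sensitivity, specificity, accuracy


# Counts reported in Table 6 (TP = 58, TN = 165, FP = 2, FN = 3).
sens, spec, acc = evaluate(tp=58, tn=165, fp=2, fn=3)
```

The same function reproduces the indices of Table 5 when called with its counts (TP = 54, TN = 156, FP = 3, FN = 15).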

3.2.1 Classification experiment based on the area ratio of pathological symptoms

In this experiment, the pathological symptoms of DR were detected and their severity grades were classified based on the area ratio of the pathological symptoms in the retina. Fig. 10 shows the detection result of pathological symptoms using the Faster R-CNN with the moderate fundus images of DR. In addition, the results are shown for the areas of hard exudate and retinal hemorrhage.

Fig. 10.

Detection of pathological symptoms using Faster R-CNN: (a) input image and (b) detected results.

Table 4.

Detection result for pathological symptoms using Faster R-CNN
File name Ratio of exudate (%) Ratio of hemorrhage (%) Sum (%)
001.jpg 0 0 0
006.jpg 0.017 0 0.017
016.jpg 0.026 0 0.026
005.jpg 0.041 0 0.041
002.jpg 0.058 0 0.058
046.jpg 0 0.083 0.083
011.jpg 0.104 0 0.104
007.jpg 0.184 0 0.184
071.jpg 0.022 0.193 0.215
070.jpg 0.005 0.258 0.263
087.jpg 0.166 0.121 0.287
088.jpg 0.126 0.193 0.319
056.jpg 0.265 0.123 0.389
094.jpg 0.417 0 0.417
025.jpg 0.116 0.343 0.460
003.jpg 0.505 0 0.505
086.jpg 0.118 0.411 0.529
076.jpg 0.108 0.437 0.545
111.jpg 0.430 0.205 0.636
078.jpg 0.434 0.245 0.680
090.jpg 0.581 0.210 0.792
100.jpg 0.137 0.709 0.847
037.jpg 0 0.855 0.855
063.jpg 0.531 0.423 0.955
023.jpg 0.734 0.235 0.970
024.jpg 0.566 0.410 0.976
036.jpg 0.788 0.208 0.997
058.jpg 0.320 0.696 1.017
051.jpg 0.681 0.437 1.118
030.jpg 0.813 0.611 1.425
075.jpg 1.335 0.126 1.461
079.jpg 0.201 1.533 1.734
055.jpg 0.312 1.522 1.834
059.jpg 0.423 1.486 1.909
089.jpg 1.634 0.537 2.171
054.jpg 1.439 1.084 2.524
091.jpg 2.588 0 2.588
039.jpg 2.461 0.352 2.814
082.jpg 1.775 1.088 2.864
044.jpg 2.152 0.822 2.975
113.jpg 2.578 0.437 3.015
095.jpg 1.979 1.233 3.212
105.jpg 2.801 0.558 3.359
098.jpg 2.280 1.114 3.394
053.jpg 0.586 2.905 3.492
050.jpg 3.525 0.386 3.912
043.jpg 1.341 2.645 3.987
099.jpg 3.712 0.715 4.428
106.jpg 3.024 2.030 5.054
102.jpg 5.098 0.368 5.466
040.jpg 5.541 0 5.541
096.jpg 3.875 2.130 6.006

Table 4 shows the ratios of pathological symptoms detected in a subset of the 228 fundus images, from which the correlation between the fundus grade and the ratio of pathological symptoms can be confirmed. The ratios of hard exudate and retinal hemorrhage are low in the early stage of DR and increase as the disease worsens. The severity grades were classified based on the ratios of pathological symptoms.

Table 5 shows the sensitivity, specificity, and accuracy calculated by entering the classification results into the confusion matrix. As shown in Table 5, 156 TN images were classified accurately. However, classifying TP images was difficult because the data included fundus images that could not be classified using the ratio of pathological symptoms. The accuracy was 92.1%, indicating a good result. However, the low sensitivity of 78.26% compared to the specificity of 98.11% suggests that classifying the severity grades of DR using only the area ratio of pathological symptoms in the fundus images has limitations.

Table 5.

Confusion matrix and the result of evaluation indices
Result of classification Result of analysis (%)
TP TN FP FN Sensitivity Specificity Accuracy
54 156 3 15 78.26 98.11 92.1
3.2.2 Classification using the Random Forest method

In this experiment, the severity grade of DR was classified using the Random Forest method based on the data extracted through the Faster R-CNN. First, the data of pathological symptoms were detected using the Faster R-CNN, as shown in Fig. 11(a).

Fig. 11.

(a) Detection result for pathological symptoms using Faster R-CNN and (b) preprocessing of (a).

The data preprocessing step was performed to extract the area of the pathological symptoms and red lesions that were detected using the Faster R-CNN. Fig. 11(b) shows the image that results from preprocessing the image of pathological symptoms in Fig. 11(a). The preprocessed data were used to calculate the number of pixels and the area of the hemorrhage lesions. Then the maximum distance, class, and score of each object obtained from the Faster R-CNN were transmitted to the Random Forest classifier. The feature data were scaled to values in the range of zero to one. Then, the total size of the decision trees was adjusted considering the limited memory. The size of the decision trees was also reduced in the training stage, while observing performance changes, to prevent overfitting and create a stable model. Table 6 shows a confusion matrix that summarizes the classification results.
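The scaling of the feature data to the range of zero to one can be illustrated with a column-wise min-max normalization. The text does not name the scaling method, so min-max scaling is an assumption of this sketch, and the feature rows are hypothetical values in the order of Table 1.

```python
def min_max_scale(rows):
    """Scale each feature column of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Hypothetical rows: [lesion pixels, distance, class type, class score].
features = [
    [120, 35.0, 0, 0.91],
    [480, 210.5, 1, 0.76],
    [300, 90.0, 0, 0.99],
]
scaled = min_max_scale(features)
```

Scaling keeps a large-valued feature such as the pixel count on the same numeric range as the class score before the rows are passed to the classifier.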

Table 6 shows that out of the 228 total test images, 58 and 165 images were classified as TP and TN, respectively. Compared to the classification result based on the area ratio of pathological symptoms in the retina in the previous experiment, the FN result was decreased from 15 to 3. This demonstrates that the use of a classifier based on the Random Forest method and the preprocessing of data detected through Faster R-CNN can derive excellent experimental results.

Table 6.

Confusion matrix and evaluation index results
Result of classification Result of analysis (%)
TP TN FP FN Sensitivity Specificity Accuracy
58 165 2 3 95.08 98.8 97.8
3.2.3 Comparison of classification methods

To evaluate the performance of the classifier proposed in this study, the SVM and the CNN were selected for comparison and evaluated using the same dataset. The SVM experiment used a linear kernel. After preprocessing the same data as those used for the Random Forest classifier, the kernel size was adjusted and the model with the highest accuracy was used for the evaluation. For the CNN used in this comparison, the number of convolution layers was increased to allow the network to learn deeper features. The network starts with convolution blocks in which each convolution layer is followed by an activation and then batch normalization. All max-pooling is performed with a 3×3 kernel and 2×2 strides. ReLU was used as the activation function, and L2 regularization was applied to the weights and biases. The network was initialized with Gaussian initialization to reduce the initial training time. The loss function was the widely used categorical cross-entropy. The Messidor data were used to train the CNN, and the experiment was performed on the 228 images from the DUMC. Table 7 outlines the comparison and analysis of the results.

Table 7.

Confusion matrix and performance evaluation results for the comparison of classifiers
Result of classification Result of analysis (%)
TP TN FP FN Sensitivity Specificity Accuracy
SVM 56 157 4 11 83.58 97.51 93.42
CNN 58 159 5 6 90.62 96.95 95.17
Proposed algorithm 58 165 2 3 95.08 98.80 97.80

Compared with the results based on the area ratio of pathological symptoms in the retina, the SVM classified more images as TP (56 versus 54) and TN (157 versus 156) and reduced FN from 15 to 11. However, it classified four images as FP, suggesting that the SVM algorithm still needs improvement. In terms of accuracy, the SVM outperformed the area-ratio method by 1.3 percentage points. The CNN showed a specificity of 96.95% and a sensitivity of 90.62%, and its accuracy was 1.75 percentage points higher than that of the SVM. The proposed method, which used the Faster R-CNN and the Random Forest classifier, showed the best accuracy of 97.8%. Furthermore, an examination of the mild fundus images that the SVM had failed to classify revealed that the retinal hemorrhage in a patient’s fundus image was similar to that of a patient in whom retinal hemorrhage rarely occurred. Thus, the failure to classify these data appropriately seems to be one of the reasons for the overall classification errors. The plain CNN also showed good results even though it did not analyze the pathological features of DR, a limitation that it appears to have overcome by using a relatively sufficient amount of data. Table 8 compares the performance with that of other existing classifiers.

The performances of other existing methods were also evaluated with various indices including accuracy. Sudarmadji et al. [13] showed the highest accuracy of 99.66%. The datasets used by the research groups include Messidor [14], Kaggle [8], and IDRiD [15]. The study of Hathwar and Srinivasa [12] reported a kappa value of 0.88, indicating almost perfect agreement in the classification results. The study of Ghan et al. [16] reported an F1-score of 92% with an accuracy of 93%. Most of the studies classified fundus images as normal or abnormal DR, or classified the severity grades of the abnormal condition, and each study used two to five classes. Although the studies in Table 8 were evaluated under different conditions in terms of dataset, deep learning network architecture, and evaluation index, the accuracy of this research compares well with those methods.

Table 8.

Performance comparison results with other existing classification methods
Study Classes Dataset Best architecture Accuracy (%) Sensitivity (%) Specificity (%) Precision (%) Recall (%)
Das et al. [9] 2-class DIARETDB1 Custom CNN 98.7 - 98.2 97.2 99.6
Padmanayana and Anoop [10] 2-class Kaggle Custom CNN 94.6 86 96 - -
Adriman et al. [11] 2-class APTOS2019 ResNet-34 96.35 - - - -
Hathwar and Srinivasa [12] 5-class EyePACS & IDRiD Xception-TL - 94.3 95.5 - 0.88 (kappa)
Sudarmadji et al. [13] 5-class Messidor Custom CNN 99.66
Elswah et al. [15] 4-class IDRiD ResNet-50 86.67 - - - -
Ghan et al. [16] 2-class IDRiD R-CNN 93 - 92 (F1-score) 86 92
Kolla and Venugopal [17] 2-class Kaggle Inception-v3 91.04 - - - -
Proposed algorithm 3-class Messidor Inception-ResNet-v2 97.8 95.08 98.8 - -

4. Conclusion

This study proposed a highly accurate DR detection and severity grade classification system using the Faster R-CNN and the Random Forest method. An experiment on the correlation between pathological symptoms and the severity grade of DR found that a data preprocessing step was necessary for efficient classification. The Faster R-CNN algorithm applied in this study was trained to extract features of the pathological symptoms of DR from the given data. Appropriate classification results could then be derived by training the Random Forest classifier on these features. The features consist of the number of pixels for microaneurysms and retinal hemorrhage, the maximum distance between objects, the class type, and the class scores. The proposed classification method based on the Random Forest classifier was compared with existing classification methods such as the SVM and CNN. The results confirmed that the proposed method achieved better results than the other methods in terms of the performance evaluation indices, including an accuracy of 97.8%. The accuracies of the similar studies introduced in Section 1 were 98.7% [9], 94.6% [10], 96.35% [11], 99.66% [13], 86.67% [15], 93% [16], and 91.04% [17]. The results of this study are therefore comparable to those of existing methods. If a large body of meaningful data can be acquired for training, it will greatly help the classification of moderate and severe grades. In the future, we plan to extend the proposed method into a system that automatically classifies detailed severity grades and that supports fast and objective judgment by examiners through real-time classification in conjunction with a fundus imaging system.
Future work will therefore focus on classifying the severity of DR more accurately by improving the internal structure and network of the deep learning algorithms after obtaining more clinical data and applying more effective image preprocessing.


This work was supported by the ICT R&D program of MSIT/IITP in Republic of Korea (No. 2018-0-00242, Development of AI ophthalmologic diagnosis and smart treatment platform based on big data).


Younghoon Jung

He received the B.S. degree (2017) from Dankook University, Yongin, Korea, and is currently pursuing the M.S. degree in the Department of Computers at the graduate school of Dankook University. He worked as a graduate student researcher at Next Generation Terminals in the Multimedia & SW Lab. His research interests include R-CNNs, deep learning, and neural networks.


Daewon Kim

He received the M.S. degree (1996) from the University of Southern California, Los Angeles, CA, USA, and the Ph.D. degree (2002) in Electrical and Computer Engineering from Iowa State University, Ames, IA, USA. He is currently a professor in the Department of Applied Computer Engineering at Dankook University, Republic of Korea. His research interests include image and signal processing, deep learning, mobile applications, and nondestructive evaluation.


  • 1 L. Qiao, Y. Zhu, and H. Zhou, "Diabetic retinopathy detection using prognosis of microaneurysm and early diagnosis system for non-proliferative diabetic retinopathy based on deep learning algorithms," IEEE Access, vol. 8, pp. 104292-104302, 2020. https://doi.org/10.1109/access.2020.2993937
  • 2 S. Chaudhary, J. Zaveri, and N. Becker, "Proliferative diabetic retinopathy (PDR)," Disease-a-Month, vol. 67, no. 5, article no. 101140, 2021. https://doi.org/10.1016/j.disamonth.2021.101140
  • 3 Early Treatment Diabetic Retinopathy Study Research Group, "Grading diabetic retinopathy from stereoscopic color fundus photographs: an extension of the modified Airlie House classification (ETDRS Report Number 10)," Ophthalmology, vol. 127, no. 4, pp. S99-S119, 2020.
  • 4 Y. Zhou, B. Wang, L. Huang, S. Cui, and L. Shao, "A benchmark for studying diabetic retinopathy: segmentation, grading, and transferability," IEEE Transactions on Medical Imaging, vol. 40, no. 3, pp. 818-828, 2021. https://doi.org/10.1109/tmi.2020.3037771
  • 5 M. M. Abdelsalam and M. A. Zahran, "A novel approach of diabetic retinopathy early detection based on multifractal geometry analysis for OCTA macular images using support vector machine," IEEE Access, vol. 9, pp. 22844-22858, 2021. https://doi.org/10.1109/access.2021.3054743
  • 6 V. Gulshan, L. Peng, M. Coram, M. C. Stumpe, D. Wu, A. Narayanaswamy, et al., "Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs," JAMA, vol. 316, no. 22, pp. 2402-2410, 2016. https://doi.org/10.1001/jama.2016.17216
  • 7 H. Pratt, F. Coenen, D. M. Broadbent, S. P. Harding, and Y. Zheng, "Convolutional neural networks for diabetic retinopathy," Procedia Computer Science, vol. 90, pp. 200-205, 2016. https://doi.org/10.1016/j.procs.2016.07.014
  • 8 A. Tolkachev, I. Sirazitdinov, M. Kholiavchenko, T. Mustafaev, and B. Ibragimov, "Deep learning for diagnosis and segmentation of pneumothorax: the results on the Kaggle competition and validation against radiologists," IEEE Journal of Biomedical and Health Informatics, vol. 25, no. 5, pp. 1660-1672, 2021. https://doi.org/10.1109/jbhi.2020.3023476
  • 9 S. Das, K. Kharbanda, M. Suchetha, R. Raman, and E. Dhas, "Deep learning architecture based on segmented fundus image features for classification of diabetic retinopathy," Biomedical Signal Processing and Control, vol. 68, article no. 102600, 2021. https://doi.org/10.1016/j.bspc.2021.102600
  • 10 Padmanayana and B. K. Anoop, "Binary classification of DR-diabetic retinopathy using CNN with fundus colour images," Materials Today: Proceedings, vol. 58, pp. 212-216, 2022. https://doi.org/10.1016/j.matpr.2022.01.466
  • 11 R. Adriman, K. Muchtar, and N. Maulina, "Performance evaluation of binary classification of diabetic retinopathy through deep learning techniques using texture feature," Procedia Computer Science, vol. 179, pp. 88-94, 2021. https://doi.org/10.1016/j.procs.2020.12.012
  • 12 S. B. Hathwar and G. Srinivasa, "Automated grading of diabetic retinopathy in retinal fundus images using deep learning," in Proceedings of 2019 IEEE International Conference on Signal and Image Processing Applications (ICSIPA), Kuala Lumpur, Malaysia, 2019, pp. 73-77. https://doi.org/10.1109/icsipa45851.2019.8977760
  • 13 P. W. Sudarmadji, P. D. Pakan, and R. Y. Dillak, "Diabetic retinopathy stages classification using improved deep learning," in Proceedings of 2020 International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS), Jakarta, Indonesia, 2020, pp. 104-109. https://doi.org/10.1109/icimcis51567.2020.9354281
  • 14 G. Saxena, D. K. Verma, A. Paraye, A. Rajan, and A. Rawat, "Improved and robust deep learning agent for preliminary detection of diabetic retinopathy using public datasets," Intelligence-Based Medicine, vol. 3, article no. 100022, 2020. https://doi.org/10.1016/j.ibmed.2020.100022
  • 15 D. K. Elswah, A. A. Elnakib, and H. E. D. Moustafa, "Automated diabetic retinopathy grading using ResNet," in Proceedings of 2020 37th National Radio Science Conference (NRSC), Cairo, Egypt, 2020, pp. 248-254. https://doi.org/10.1109/nrsc49500.2020.9235098
  • 16 G. Ghan, S. Chavan, and A. Chaudhari, "Diabetic retinopathy classification using deep learning," in Proceedings of 2020 4th International Conference on Inventive Systems and Control (ICISC), Coimbatore, India, 2020, pp. 761-765. https://doi.org/10.1109/icisc47916.2020.9171139
  • 17 M. Kolla and T. Venugopal, "Efficient classification of diabetic retinopathy using binary CNN," in Proceedings of 2021 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE), Dubai, United Arab Emirates, 2021, pp. 244-247. https://doi.org/10.1109/iccike51210.2021.9410719
  • 18 S. Valarmathi and R. Vijayabhanu, "A survey on diabetic retinopathy disease detection and classification using deep learning techniques," in Proceedings of 2021 7th International Conference on Bio Signals, Images, and Instrumentation (ICBSII), Chennai, India, 2021, pp. 1-4. https://doi.org/10.1109/icbsii51839.2021.9445163
  • 19 S. Sooraj and M. Bedeeuzzaman, "Automatic classification of diabetic retinopathy based on deep learning: a review," in Proceedings of 2020 International Conference on Futuristic Technologies in Control Systems & Renewable Energy (ICFCR), Malappuram, India, 2020, pp. 1-5.
  • 20 M. Z. Atwany, A. H. Sahyoun, and M. Yaqub, "Deep learning techniques for diabetic retinopathy classification: a survey," IEEE Access, vol. 10, pp. 28642-28655, 2022. https://doi.org/10.1109/access.2022.3157632
  • 21 N. Tsiknakis, D. Theodoropoulos, G. Manikis, E. Ktistakis, O. Boutsora, A. Berto, et al., "Deep learning for diabetic retinopathy detection and classification based on fundus images: a review," Computers in Biology and Medicine, vol. 135, article no. 104599, 2021. https://doi.org/10.1016/j.compbiomed.2021.104599
  • 22 S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: towards real-time object detection with region proposal networks," Advances in Neural Information Processing Systems, vol. 28, pp. 91-99, 2015. https://doi.org/10.1109/tpami.2016.2577031
  • 23 W. Cao, N. Czarnek, J. Shan, and L. Li, "Microaneurysm detection using principal component analysis and machine learning methods," IEEE Transactions on Nanobioscience, vol. 17, no. 3, pp. 191-198, 2018. https://doi.org/10.1109/tnb.2018.2840084
  • 24 C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, "Inception-v4, Inception-ResNet and the impact of residual connections on learning," in Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI), San Francisco, CA, 2017, pp. 4278-4284. https://doi.org/10.1609/aaai.v31i1.11231