1. Introduction
According to Food and Agriculture Organization (FAO) reports, plant diseases cost the global economy approximately US$220 billion every year [1]. Apples are among the most widely consumed fruits worldwide, yet apple production is threatened by a variety of diseases [2]. To improve apple production, agricultural experts have made long-term efforts to fight apple diseases. Most apple disease identification methods still require manual inspection, which not only consumes considerable manpower and material resources but also depends strongly on the inspector's experience. Since the emergence of machine learning technologies, an increasing number of researchers have used them to protect plants from disease. To date, apple leaf disease identification has followed two main directions. One direction combines manually extracted disease features with traditional machine learning classifiers, while the other uses deep learning models that learn the features automatically.
Chakraborty et al. [3] proposed an apple leaf disease identification method based on image segmentation and a multiclass support vector machine (SVM). The authors first used image processing methods to segment the infected regions of apple leaves and then classified them with an SVM classifier. The experimental results showed satisfactory performance. Ayyub and Manjramkar [4] proposed a multi-feature combined method for apple disease identification. They first extracted apple disease features such as color, texture, and shape, and then fed them into the proposed multiclass SVM for classification. The proposed method achieved up to 96% accuracy. James and Sujatha [5] proposed a hybrid neural clustering classifier for classifying various apple fruit diseases. The method has two stages: the k-means algorithm first clusters the feature vectors, and a backpropagation (BP) neural network then performs the classification. The proposed method achieved 98% accuracy. Pandiyan et al. [6] proposed an apple leaf disease classification method based on the heterogeneous Internet of Things (HIoT); the identification accuracy was as high as 97.35%.
The methods mentioned above achieved excellent performance by extracting features manually for identification. However, the comprehensiveness of manually extracted features is limited, especially in large-scale planting environments. Deep learning technologies overcome this issue, because deep learning models are good at processing large-scale data and at learning feature representations automatically from large amounts of given data [7]. Many studies have been published in the field of apple leaf disease identification. Jiang et al. [8] proposed a real-time apple leaf disease detection approach based on an improved deep convolutional neural network (CNN). The deep CNN model was built with rainbow concatenation and the Inception structure, and it achieved 78.80% mAP on the apple leaf disease dataset (ALDD). Yu and Son [9] proposed an attention mechanism based deep learning method for leaf spot detection tasks. The authors designed two subnetworks for apple leaf spot disease identification: one subnetwork separated the background from the whole leaf, and the other subnetwork performed the classification. By modeling leaf spot attention, the method improved the accuracy and outperformed conventional deep learning models. Li and Rai [10] compared ResNet-18, SVM, and VGG for classification, and the ResNet model obtained better classification results than the others. Nagaraju et al. [11] analyzed apple leaf diseases such as black measles, black rot, and leaf blight, and then proposed an improved VGG-16 model for identification. The proposed method achieved excellent performance. Agarwal et al. [12] proposed an FCNN-LDA method for apple leaf disease identification, which achieved a higher accuracy than the existing models. Tahir et al. [2] proposed an Inception-v3 model retrained via transfer learning for apple disease classification, and the method reached 97% accuracy.
Gargade and Khandekar [13] summarized the factors that affect leaf disease identification, and the conclusion showed that the machine learning algorithm can identify the features that cannot be found by the naked eye, which was also why machine learning was better than humans for identifying plant diseases. In addition to detecting apple leaf diseases, deep learning models have also achieved praiseworthy results in other plant disease identification [14-21].
All of the studies mentioned above have made outstanding contributions to the identification of apple and other plant diseases. However, there is still room for improvement in apple disease identification. In this article, a novel deep learning model named the improved deep residual network is proposed for apple disease identification by adjusting the parameters and weights. Additionally, a global residual connection is added to the original network, and the local residual connection architecture is optimized. The improved deep residual network achieves 99% validation accuracy and 98.74% top-1 test accuracy, which is better than the accuracies of the existing apple leaf disease identification methods compared in this work. The main contributions of this work are as follows:
· An improved deep residual network is proposed to identify apple leaf diseases.
· A global residual connection is adopted in the proposed network.
· The local residual connection architecture is optimized in this work.
This paper is organized as follows: Section 2 describes related works on apple leaf disease identification, and analyzes the popular deep learning models, as well as their merits and demerits. Section 3 details the proposed improved deep residual network. Section 4 analyzes and discusses the experimental results. The conclusion is presented in Section 5.
2. Related Works
Machine learning algorithms have achieved meaningful results in various fields. Deep learning technologies have been very successful in image processing and in apple leaf disease identification tasks. In the ImageNet competition [22], different models were presented and new records were set. Popular deep learning models include AlexNet, the Inception networks, and deep residual networks (ResNet).
2.1 AlexNet
As one of the most popular deep CNN models, AlexNet [23] has won many vision competitions such as the ImageNet and COCO competitions [24]. On the ImageNet dataset, AlexNet achieved more than 80% improvement over conventional machine learning methods such as k-means and support vector machines. The framework consists of eight weighted layers, and a normalization function was introduced to avoid the overfitting problem. Despite its unparalleled historical achievements, it has shortcomings that cannot be ignored. Gradient degradation may occur when a convolution layer is removed, which causes the gradient to vanish quickly. As a result, the model overfits easily when the training data are limited.
2.2 Inception Network
The deep CNNs mentioned above stack convolution layers sequentially, so that the output of one layer is fed as the input to the next layer. Unlike those models, the Inception network defines a novel architecture that applies different operations in parallel within its convolution layers. It achieved excellent results on ImageNet. The Inception network contains convolution layers with multiple kernel sizes [25], which increases the width of the network and improves the classification accuracy. However, the model needs more computational resources, and therefore a larger investment in hardware, to obtain excellent performance.
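The idea of widening a layer with several kernel sizes can be illustrated with a minimal sketch. It is a one-dimensional toy in plain Python, not the Inception architecture of [25]; the function names, the averaging kernels, and the zero padding are assumptions made only for illustration.

```python
def conv1d_same(signal, kernel):
    """1-D cross-correlation with zero padding so the output keeps the input length.

    Assumes an odd kernel length.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def inception_like_block(signal, kernels):
    """Apply every kernel to the same input and stack the results as channels."""
    return [conv1d_same(signal, kernel) for kernel in kernels]


# Hypothetical averaging kernels of sizes 1, 3 and 5.
kernels = [[1.0], [1 / 3] * 3, [0.2] * 5]
signal = [0.0, 0.0, 1.0, 0.0, 0.0]
channels = inception_like_block(signal, kernels)
```

Each branch sees the same input at a different receptive field, and the block output has one channel per branch, which is how the width grows. It is also why the computational cost grows with the number of branches.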
2.3 Residual Network
As one of the most important and popular deep learning models, ResNet achieved excellent performance on the COCO dataset for object detection. The ResNet models won the ImageNet recognition and detection competitions as well as the COCO detection and segmentation competitions [26]. As shown in Fig. 1, a residual connection is introduced in the original structure of the residual network.
Fig. 1. The original structure of the residual network.
3. Proposed Method
3.1 Structure of the Improved Deep Residual Network
In this study, an improved deep residual network is proposed. In contrast to the previous network models, the residual connection in the residual blocks is defined as follows:
$$x_{l+1}=x_{l}+f\left(x_{l}, w_{l}\right) \tag{1}$$
where $x_{l+1}$ and $x_{l}$ denote the features of the residual blocks at the $(l+1)^{th}$ and $l^{th}$ layers, respectively, and $f\left(x_{l}, w_{l}\right)$ denotes the residual part of the block with weights $w_{l}$. If the shapes of the feature maps $x_{l+1}$ and $x_{l}$ differ in the network, a dimension-matching operation is needed, and the residual connection block is defined as follows:
$$x_{l+1}=h\left(x_{l}\right)+f\left(x_{l}, w_{l}\right) \tag{2}$$
where $h\left(x_{l}\right)=w_{l}^{\prime} x_{l}$ denotes the $1 \times 1$ convolution operation. Let $L$ denote a deeper layer and $l$ a shallower layer. Applying (1) recursively gives the relationship between $x_{L}$ and $x_{l}$:
$$x_{L}=x_{l}+\sum_{i=l}^{L-1} f\left(x_{i}, w_{i}\right) \tag{3}$$
where the summation runs over the residual parts of the shallower layers between $l$ and $L$.
According to the chain rule of derivatives used in backpropagation, the gradient of the loss function $\varepsilon$ with respect to $x_{l}$ can be expressed as follows:
$$\frac{\partial \varepsilon}{\partial x_{l}}=\frac{\partial \varepsilon}{\partial x_{L}} \frac{\partial x_{L}}{\partial x_{l}}=\frac{\partial \varepsilon}{\partial x_{L}}\left(1+\frac{\partial}{\partial x_{l}} \sum_{i=l}^{L-1} f\left(x_{i}, w_{i}\right)\right) \tag{4}$$
It can be seen from (4) that, because the term $\frac{\partial}{\partial x_{l}} \sum_{i=l}^{L-1} f\left(x_{i}, w_{i}\right)$ cannot always equal $-1$ during the whole training process, the gradient does not vanish in the residual network. The term $\frac{\partial \varepsilon}{\partial x_{L}}$ is propagated directly from layer $L$ to any shallower layer $l$.
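The role of the identity shortcut can be checked numerically with a minimal sketch. It assumes scalar features and a hypothetical residual branch f(x, w) = w * tanh(x); the weights and the input value are arbitrary illustrations and are not taken from the proposed network. The sketch unrolls the recursion x_{l+1} = x_l + f(x_l, w_l) and compares the chain-rule gradient of x_L with respect to x_l against a finite-difference estimate.

```python
import math


def residual_branch(x, w):
    """Hypothetical scalar residual branch f(x, w) = w * tanh(x)."""
    return w * math.tanh(x)


def forward(x_l, weights):
    """Apply x_{i+1} = x_i + f(x_i, w_i) once per weight; return all activations."""
    xs = [x_l]
    for w in weights:
        xs.append(xs[-1] + residual_branch(xs[-1], w))
    return xs


def analytic_gradient(xs, weights):
    """d x_L / d x_l as the product of (1 + df/dx) over the blocks (chain rule)."""
    grad = 1.0
    for x, w in zip(xs, weights):
        grad *= 1.0 + w * (1.0 - math.tanh(x) ** 2)
    return grad


weights = [0.5, -0.3, 0.8]  # illustrative values only
x_l = 0.7
xs = forward(x_l, weights)

# Unrolled form: x_L equals x_l plus the sum of all residual branches.
unrolled = x_l + sum(residual_branch(x, w) for x, w in zip(xs, weights))

# Central finite difference as an independent check of the analytic gradient.
eps = 1e-6
numeric = (forward(x_l + eps, weights)[-1] - forward(x_l - eps, weights)[-1]) / (2 * eps)
analytic = analytic_gradient(xs, weights)
```

Every factor of the product has the form 1 plus a branch derivative, so the gradient reaching the shallow layer shrinks toward zero only if a branch derivative approaches -1. This is the scalar counterpart of the argument made from the chain rule.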
In this study, the residual connections are divided into global residual connections and local residual connections. In the global residual connection, the residual is connected between the input layer and the dense layer to prevent the gradient from vanishing. In the local residual connections, unlike in the original residual network, the proposed method makes a local-global connection: the input of a layer comes from the concatenated output of the preceding layers. The structure of the proposed model is shown in Fig. 2.
Fig. 2. Structure of the improved deep residual network.
3.2 Global Residual Connection
To maintain the identity mapping ability, a global residual connection is adopted in this study, which contains a convolution layer and three maxpooling operations. The convolution layer has 256 filters, a kernel size of 256, and a stride of (1, 1). The outputs of the three maxpooling operations are concatenated to the dense layer for classification, with the purpose of improving the feature representation ability.
3.3 Local Residual Connection
In the local residual connection block, the maxpooling operation is first used with a pooling size of 3 and a stride of (2, 2), which aims to reduce the dimension of the input data. There is a maxpooling operation and a convolution layer between each residual block for feature extraction. The purpose of local residual connections is to keep the tensor output from previous layers activated to maintain the network’s excellent training performance.
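The dimension reduction performed by a maxpooling operation with a pooling size of 3 and a stride of 2 can be sketched in plain Python. The padding mode is not stated in the text, so the sketch assumes no padding; the function name and the toy input are illustrative only.

```python
def max_pool_2d(feature_map, pool_size=3, stride=2):
    """Max pooling without padding over a 2-D list of numbers."""
    height, width = len(feature_map), len(feature_map[0])
    out_h = (height - pool_size) // stride + 1
    out_w = (width - pool_size) // stride + 1
    return [
        [
            max(
                feature_map[r * stride + i][c * stride + j]
                for i in range(pool_size)
                for j in range(pool_size)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# A 7x7 toy input whose entries increase row by row.
toy = [[row * 7 + col for col in range(7)] for row in range(7)]
pooled = max_pool_2d(toy)  # 3x3 output under the no-padding assumption
```

Under this assumption, an input of side n is reduced to side floor((n - 3) / 2) + 1, so each pooling step roughly halves the spatial resolution.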
4. Experiment and Analysis
4.1 Experimental Environment and Data Acquisition
The experiments are run on Windows 10 with 2*i7-9700 @3.00 GHz CPU, 16 GB of memory, and an NVIDIA GeForce GTX 1650 GPU. The programming language is Python, and the TensorFlow framework is used with CUDA 10.1. In this study, apple leaf disease images are collected from the AI Challenger 2018 dataset [21], which contains 1,977 images in three categories. To compare the models fairly, all of them are trained and tested on the same dataset, which is shown in Table 1. Some apple leaf images are shown in Fig. 3.
Table 1. Dataset of apple leaf diseases.
4.2 Training Details
In this study, the improved ResNet-50 is trained on an NVIDIA GeForce GTX 1650 GPU. Before training, the dataset is split into three parts with the train_test_split function: 60% for training, 20% for validation, and 20% for testing. Adadelta and cross-entropy are adopted as the optimizer and the loss function, respectively, and the initial learning rate is set to 0.0001. The batch size is set to 32, and the proposed model is trained for 200 epochs. The hyperparameters used in the training process are detailed in Table 2.
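The 60/20/20 split and the stated hyperparameters can be sketched as follows. This is a plain-Python stand-in, not the train_test_split call used in the experiments: the helper name, the random seed, the rounding rule, and the dictionary keys are assumptions, so the exact subset sizes may differ from those of the original split.

```python
import math
import random


def split_indices(n_samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# Values taken from the text; the dictionary keys themselves are illustrative.
config = {
    "optimizer": "adadelta",
    "loss": "cross-entropy",
    "learning_rate": 1e-4,
    "batch_size": 32,
    "epochs": 200,
}

train, val, test = split_indices(1977)
steps_per_epoch = math.ceil(len(train) / config["batch_size"])
```

With this rounding rule, the 1,977 images are divided into 1,186 training, 395 validation, and 396 test indices, and one epoch corresponds to 38 mini-batches of at most 32 images.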
4.3 Results and Analysis
4.3.1 Evaluation metrics
The top-1 accuracy is introduced as the evaluation metric, which is defined as follows:
$$\text{Accuracy}=\frac{C}{N} \tag{5}$$
where $N$ denotes the total number of samples, and $C$ denotes the number of correctly predicted samples.
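The metric can be computed as in the following minimal sketch, in which the top-1 prediction of a sample is the class with the highest score. The function name and the toy scores are hypothetical and are not results from this study.

```python
def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the true label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    correct = sum(
        1
        for row, label in zip(scores, labels)
        if max(range(len(row)), key=row.__getitem__) == label
    )
    return correct / len(labels)


# Hypothetical class scores for four samples over three classes.
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
]
labels = [0, 1, 2, 1]
accuracy = top1_accuracy(scores, labels)  # C = 3 correct out of N = 4
```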
4.3.2 Results
To ensure a fair comparison of the different algorithms, the same dataset and experimental environment are used in this study. Several state-of-the-art models are selected for comparison, including SVM [28,29], k-means [30], AlexNet [23], Inception-ResNet-v2 [25], and ResNet-50 [26], as well as more recent methods by Ayyub and Manjramkar [4], Jiang et al. [8], and Tahir et al. [2].
Fig. 4 shows the training and validation accuracy of the deep learning models, and Fig. 5 shows their loss. The loss curves of AlexNet and Inception-ResNet-v2 are unstable. With the same loss function, ResNet-50 and the improved ResNet-50 are stable and converge, and the improved ResNet-50 converges better. On the test dataset, an SVM with a Gaussian kernel was adopted for nonlinear classification; however, because the data are scattered, it is difficult to achieve the desired effect. As shown in Fig. 6, k-means does not perform well on the scattered data either. Fig. 7 shows the identification accuracy of the classic and the latest state-of-the-art models. The improved ResNet-50 achieves the highest top-1 identification accuracy, which demonstrates the ability of the proposed method.
Fig. 4. Accuracy of deep learning models with training and validation: (a) AlexNet, (b) Inception-ResNet-v2, (c) ResNet-50, and (d) improved ResNet-50.
Fig. 5. The loss of deep learning models: (a) AlexNet, (b) Inception-ResNet-v2, (c) ResNet-50, and (d) improved ResNet-50.
Fig. 6. K-means algorithm on test data.
Fig. 7. Accuracy of different algorithms on test dataset.
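For the SVM baseline discussed above, the Gaussian kernel can be written out as a short sketch. The kernel width gamma and the toy feature vectors are assumptions, since the kernel parameters used in the experiments are not given in the text.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def gram_matrix(points, gamma=0.5):
    """Pairwise kernel values, the quantity a kernel SVM works with."""
    return [[gaussian_kernel(p, q, gamma) for q in points] for p in points]


# Hypothetical two-dimensional feature vectors.
points = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
gram = gram_matrix(points)
```

The kernel value decays quickly with distance, so widely scattered samples have similarities close to zero. This is consistent with the difficulty reported for the SVM on scattered data.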
4.3.3 Results analysis
The experimental results show that the improved deep residual network performs better than the classic models in apple leaf disease identification tasks, which indicates that the proposed architecture combining global and local residual connections has its own interpretable advantages. The main reason is that the global residual connection with three maxpooling operations maintains the identity mapping ability, and the local residual connection preserves the gradient during training. Therefore, the proposed method achieves satisfactory identification accuracy. However, the added residual connections increase the complexity of the network, which raises the requirements for computing resources.
5. Conclusion
Food security is one of the most urgent problems in the world, and every year humans struggle against plant diseases to improve grain production. The apple is one of the most popular fruits worldwide and needs to be protected from disease. Apple diseases usually first become visible on the leaves, which makes leaf disease identification particularly important for pathological diagnosis. This paper analyzes the literature on apple disease identification and proposes an improved deep residual network that combines local and global residual connections. A global residual connection is added to the classic residual network, and the local residual connection architecture is optimized. Apple leaf disease images from the AI Challenger 2018 dataset are adopted for training, testing, and comparison; they comprise 1,977 images in three categories: Cedar Rust, Scab, and Healthy. The proposed interpretable method achieves 98.74% accuracy on the test set, outperforming the existing models and demonstrating the effectiveness of the proposed method. This study focuses on apple diseases, and it is hoped that more models with better performance will be designed for the identification of apple and other plant diseases, thereby contributing to food security.
Acknowledgement
This work was supported by the 2021 project of the 14th Five Year Plan of Educational Science in Heilongjiang Province (No. GJB1421224 and GJB1421226), and the 2021 smart campus project of the agricultural college branch of CAET (No. C21ZD02).