1. Introduction
Experts and researchers have made significant efforts to improve grain production. The accurate identification of crop diseases is a crucial research topic in the field of plant protection. With the development of information technology, especially with the application of artificial intelligence (AI) in the agricultural field, the efficiency and accuracy of plant disease identification have greatly improved. AI-based disease identification models are typically divided into three stages: data acquisition, data analysis and model selection, and data testing and model evaluation. The remainder of this paper is organized to follow these three stages.
In recent years, plant disease identification has followed two branches. The first uses conventional machine learning algorithms, such as support vector machine (SVM), K-means, and random forest. Bhimte and Thool [1] proposed an SVM-based cotton leaf disease recognition model in India, in which they first used image processing methods to select cotton disease features and then extracted features, including texture, color, and shape, via a data augmentation method. Finally, the features were integrated into an SVM classifier to identify cotton diseases. Padilla et al. [2] focused on sugarcane leaf disease identification. They hypothesized that the yellow spot of sugarcane can be detected by machine learning methods, and based on their analysis, they proposed an SVM-based model that can classify leaves infected with yellow spot. Sharma et al. [3] analyzed the main factors affecting India's agricultural situation and proposed a plant leaf disease detection method. This approach, which could quickly classify leaf diseases and aimed to increase grain yield, consisted of four steps: image collection, image preprocessing, image segmentation, and image classification.
These methods have achieved advanced performance in crop disease recognition. However, they are based on traditional machine learning, which requires manually extracted features as inputs. This limitation restricts the comprehensiveness of feature extraction, particularly in extensive planting environments. The second branch, deep learning, has mitigated this problem. Deep learning models can handle large-scale data and learn feature representations automatically from the given data [4]. In recent years, deep learning models have played an increasingly important role in crop disease identification. Hasan et al. [5] focused on yellow mosaic disease identification using deep convolutional neural network (CNN) methods. They used a dataset containing 600 yellow mosaic leaf images, which they split into a training set and a test set. This method achieved 96% accuracy without using any manual feature extraction methods. Barbedo [6] considered deep learning to be an important tool that can provide good solutions to plant disease classification problems. That study analyzed the main factors affecting the models and created a leaf image database that could be used for academic purposes. Ashok et al. [7] proposed a deep architecture approach for plant leaf disease detection tasks. They first employed an image segmentation and clustering method and fed the extracted features into the proposed models. A tomato leaf disease dataset was adopted for training, and a reliable, safe, and accurate system was developed based on the proposed method. Zhou et al. [8] developed a tomato image disease recognition approach using a reconstructed deep residual-dense network that achieved satisfactory performance.
Hence, both conventional machine learning and deep learning algorithms have been widely used in plant leaf disease recognition. Compared to manual observation, both have significantly improved accuracy and efficiency. However, these technologies have some limitations:
· Lack of sufficient training data. With limited training data, deep learning models cannot fully exploit their feature representation ability and often suffer from overfitting.
· Some existing models generalize poorly. When a pre-trained model is transferred to a new crop category, it is difficult to achieve satisfactory performance.
Furthermore, the lack of training data sharply degrades diagnostic performance in new environments [9]. To mitigate these limitations, this study proposes a crop disease identification model that uses transfer learning. Specifically, deep learning models pre-trained on large-scale datasets, such as ImageNet, were adopted and used to perform downstream tasks. The ImageNet dataset contains 14,197,122 images in 21,841 categories, so models trained on ImageNet can provide satisfactory performance. However, in practical agricultural settings, training with extensive datasets for downstream tasks is not feasible. Some deep learning models have access to only a few hundred training images, making it challenging to achieve the desired results.
In this study, transfer learning was implemented to transfer parameters and weights to new models. After fine-tuning and retraining on a specific dataset, the proposed method exhibited a stronger feature representation ability for identifying crop diseases. Because the ImageNet dataset is similar to crop leaf image data, it is feasible to train crop leaf disease models with the help of parameters and weights pre-trained on large-scale datasets such as ImageNet. The main contributions of this study are as follows:
· A transfer learning adaptation strategy was proposed as the stem stage for selecting pre-trained models.
· The transfer learning method and deep learning models were combined to compensate for a lack of training data.
· Hyperparameters and weights were fine-tuned and optimized using the proposed method to improve the generalization ability.
2. Materials and Methods
2.1 Data Acquisition
To improve the generalization ability of the model, a dataset of 16,060 images was collected, covering three categories of potato leaf disease and nine categories of tomato leaf disease. Some of the images were downloaded from AI Challenger and the rest were photographed manually; all images were resized to $$196 \times 196$$ pixels. Samples from the 12 leaf disease categories are shown in Fig. 1.
Fig. 1. (a–l) Sample leaf images from the 12 categories of leaf disease.
2.2 Related Work and Proposed Method
In this work, the transfer learning method is combined with improved deep learning models. Because the ImageNet dataset is similar to the dataset used for crop leaf disease identification, deep learning models pre-trained on ImageNet were employed and fine-tuned in this study.
2.2.1 Transfer learning
Deep learning models have strong feature representation capabilities, but they require large-scale datasets and long training times, requirements that often cannot be met in practice. Transfer learning can largely compensate for these deficiencies, and it is particularly well suited to image classification with deep CNNs. A pre-trained model learns parameters and weights from the original data, which are then applied in a new, related field to solve a new problem; it can achieve strong performance once a new classifier is retrained and fine-tuned [10].
Transfer learning requires identifying the feature relationships between the source and target domains and connecting the two domains through their common features [11]. In addition, the features that differ between the two domains should guide changes to the model architecture so that the transferred knowledge can be applied to the new task in the target domain. Specifically, transfer learning can be expressed by formula (1):
$$D=\{\chi, P(X)\} \qquad (1)$$
where D denotes a domain (here the source domain), χ is the feature space, and P(X) denotes the marginal distribution, with $$X=\left\{x_1, x_2, \ldots, x_n\right\} \in \chi.$$ The target task is shown in (2):
$$T=\{y, f(\cdot)\} \qquad (2)$$
where y denotes the label space and $$f(\cdot)$$ denotes the target prediction function. The category and structure of transfer learning are shown in Fig. 2.
Fig. 2. The category and structure of transfer learning.
Fig. 2 shows that transfer learning can be adapted to the target domain only when $$P_s(X) \neq P_t(X)$$ and $$P_s(y \mid x)=P_t(y \mid x),$$ where the subscripts s and t denote the source and target domains. In crop leaf disease identification tasks, the marginal distributions of ImageNet and the crop leaf disease dataset are different, as shown in (3):
$$P_s(X) \neq P_t(X) \qquad (3)$$
In contrast, the feature space and label space of ImageNet and the crop leaf disease dataset are the same, as shown in (4):
$$\chi_s=\chi_t, \quad y_s=y_t \qquad (4)$$
From (4), ImageNet and the crop leaf disease dataset are isomorphic; consequently, transfer learning is appropriate in this study. Together, (3) and (4) indicate that domain adaptation of deep learning models from ImageNet to crop leaf diseases should work well in theory.
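The condition that the marginal distributions of the two domains differ can be checked empirically on a sample from each domain. The following is a minimal sketch, not the procedure used in this study: it compares histograms of a single scalar image statistic with the total variation distance. The choice of statistic, the bin count, and the toy values are all assumptions made for illustration.

```python
from collections import Counter


def histogram(values, bins=4, lo=0.0, hi=1.0):
    """Return a normalized histogram (bin probabilities) of values in [lo, hi]."""
    counts = Counter()
    width = (hi - lo) / bins
    for v in values:
        # Values on the upper edge are clamped into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(values)
    return [counts[i] / total for i in range(bins)]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Toy values standing in for a per-image statistic such as mean green
# intensity. They are invented for illustration, not measured.
source_stat = [0.10, 0.35, 0.55, 0.80, 0.20, 0.65, 0.45, 0.90]  # generic images
target_stat = [0.55, 0.60, 0.70, 0.65, 0.80, 0.72, 0.58, 0.68]  # leaf images

p_s = histogram(source_stat)
p_t = histogram(target_stat)
tv = total_variation(p_s, p_t)
print(f"TV distance between marginals: {tv:.3f}")
```

A distance near 0 would suggest that the two samples follow similar marginal distributions, while a clearly positive distance, as with these toy values, is consistent with the marginal distributions being different.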
2.2.2 Deep learning architectures
In this study, several deep learning models were introduced and fine-tuned with the transfer learning approach, including AlexNet [12], VGG-16 [13], Inception-ResNet-v2 [14], ResNet [15], and DenseNet-121 [16]. This section briefly introduces these models.
AlexNet proved its ability in the 2012 ImageNet Challenge [17]. As the first deep learning model to win the challenge, it achieved a top-5 error of 15.3%, compared with 26.2% for the second-best entry. VGG won the localization task of the 2014 ImageNet challenge. It uses $$3 \times 3$$ convolution kernels and $$2 \times 2$$ pooling kernels throughout, which gives it a stronger representation ability. VGG is widely used to extract features from images; its disadvantage is that VGG-16 has about 140M parameters, so it needs more storage space. Inception-ResNet-v2 builds on the Inception family, which, following VGG, uses two stacked $$3 \times 3$$ convolutions instead of one $$5 \times 5$$ convolution and incorporates batch normalization; it further adds residual connections. ResNet was the champion of the ILSVRC 2015 competition and also won the COCO 2015 detection and segmentation competitions [18]. Its primary enhancement was the residual block: residual connections alleviate the degradation problem in very deep networks, which has made ResNet one of the most widely adopted deep networks. DenseNet won the best paper award at CVPR 2017. It introduces a denser form of shortcut, the dense connection, which makes the model more robust and speeds up convergence. In addition, DenseNet has fewer parameters and a faster training speed than ResNet.
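The saving from using two stacked 3 × 3 convolutions instead of one 5 × 5 convolution, which cover the same 5 × 5 receptive field, can be verified with a short calculation. The helper below is a hypothetical illustration: it ignores bias terms and assumes equal input and output channel counts, and the channel width of 64 is chosen only as an example.

```python
def conv_weights(kernel_size, in_channels, out_channels):
    """Number of weights in a 2-D convolution layer, ignoring biases."""
    return kernel_size * kernel_size * in_channels * out_channels


channels = 64  # assumed channel width, chosen only for illustration

one_5x5 = conv_weights(5, channels, channels)
two_3x3 = 2 * conv_weights(3, channels, channels)

print(one_5x5, two_3x3, two_3x3 / one_5x5)
```

Under these assumptions the stacked pair needs 18/25 (72%) of the weights of the single 5 × 5 layer, independent of the channel width, and gains an extra nonlinearity between the two layers.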
2.2.3 Proposed method
The transfer learning-based approach proposed in this study is outlined as follows. First, a CNN model pre-trained on a large-scale dataset such as ImageNet was employed. Subsequently, a knowledge transfer process was executed on the pre-trained model, transferring its parameters and weights to a new model trained on the new dataset. During this knowledge transfer, the overall structure and the original convolutional layers of the model were adjusted, based on the relevant parameters, to meet the requirements of the new learning task.
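The parameter transfer step can be sketched in a framework-independent way. In the following illustration, which is an assumption about one reasonable realization rather than the exact procedure of this study, each model is represented as a dictionary from layer names to nested lists of weights. Layers are copied when their names and shapes match, and the classifier layer is skipped because the number of output classes differs. The names `transfer_weights`, `shape`, and the layer names are hypothetical.

```python
import copy


def shape(nested):
    """Shape of a rectangular nested list, e.g. [[0, 0], [0, 0]] -> (2, 2)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0] if nested else None
    return tuple(dims)


def transfer_weights(pretrained, target, skip_prefixes=("classifier",)):
    """Copy weights from `pretrained` into `target` where name and shape match.

    Layers whose name starts with a prefix in `skip_prefixes` are left
    untouched, so the new classifier head keeps its own initialization.
    Returns the names of the layers that were transferred.
    """
    transferred = []
    for name, weights in pretrained.items():
        if name.startswith(tuple(skip_prefixes)):
            continue
        if name in target and shape(target[name]) == shape(weights):
            target[name] = copy.deepcopy(weights)
            transferred.append(name)
    return transferred


# Toy models: the source classifier has 1000 outputs (an ImageNet-style
# head), the target classifier has 12 outputs (the 12 leaf categories).
pretrained = {
    "conv1": [[0.1, 0.2], [0.3, 0.4]],
    "conv2": [[0.5, 0.6]],
    "classifier": [[0.0] * 1000 for _ in range(2)],
}
target = {
    "conv1": [[0.0, 0.0], [0.0, 0.0]],
    "conv2": [[0.0, 0.0]],
    "classifier": [[0.0] * 12 for _ in range(2)],
}

moved = transfer_weights(pretrained, target)
print(moved)
```

Copying by name and shape keeps the convolutional feature extractor from the source domain while leaving the task-specific head free to be trained on the new dataset.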
In this work, a transfer learning adaptation strategy was proposed. First, a stem stage was introduced for selecting pre-trained models, in which the data distributions and conditional probabilities of the source domain and the target task are investigated. Domain adaptation by transfer learning works well only when the feature space and label space are isomorphic and the marginal distributions of the two domains differ. Second, once the pre-trained model has been selected, a series of parameters must be fine-tuned to adapt it to the new task. In this work, the input image size, the number of label categories, the activation function, the loss function, the learning rate, and the optimization function were fine-tuned. Lastly, a new fully connected layer was designed for the crop leaf disease identification task. The deep learning models pre-trained on ImageNet were then trained on the crop leaf disease dataset. After training and parameter fine-tuning until the models converged to the desired results, the test data were used for final evaluation. To improve the generalization ability of the proposed models, a normalization layer was incorporated into each model. Additionally, dropout layers were introduced to mitigate overfitting, and fully connected layers were added to give the models their recognition capability. The structure of the improved deep transfer learning models is shown in Fig. 3.
Fig. 3. The structure of the improved deep transfer learning models.
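The layers added on top of the pre-trained backbone (normalization, dropout, and a fully connected classifier) can be illustrated with a small forward pass in plain Python. This is only a sketch under stated assumptions: the text does not specify the type of normalization, so per-sample standardization is used as a stand-in; the dropout rate of 0.5, the feature width of 8, the softmax output, and the random weights are likewise hypothetical.

```python
import math
import random


def normalize(features, eps=1e-5):
    """Standardize one feature vector to zero mean and unit variance."""
    mean = sum(features) / len(features)
    var = sum((f - mean) ** 2 for f in features) / len(features)
    return [(f - mean) / math.sqrt(var + eps) for f in features]


def dropout(features, rate, training, rng):
    """Inverted dropout: zero units with probability `rate` during training."""
    if not training or rate == 0.0:
        return list(features)
    keep = 1.0 - rate
    return [f / keep if rng.random() < keep else 0.0 for f in features]


def dense_softmax(features, weights, biases):
    """Fully connected layer followed by softmax over the class scores."""
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def head_forward(features, weights, biases, rate=0.5, training=False, rng=None):
    """Normalization -> dropout -> fully connected softmax classifier."""
    rng = rng or random.Random(0)
    x = normalize(features)
    x = dropout(x, rate, training, rng)
    return dense_softmax(x, weights, biases)


NUM_CLASSES = 12  # 3 potato + 9 tomato categories, as in Section 2.1
rng = random.Random(42)
backbone_features = [rng.random() for _ in range(8)]  # stand-in for CNN output
weights = [[rng.uniform(-0.1, 0.1) for _ in range(8)] for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

probs = head_forward(backbone_features, weights, biases)
print(len(probs), round(sum(probs), 6))
```

Dropout is active only when `training=True`, so evaluation on the test set uses the full feature vector, and the output is a probability distribution over the 12 disease categories.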
3. Experimental Results and Analysis
3.1 Evaluation Metrics
A series of evaluation metrics, namely accuracy, precision, recall, and F1-score, were adopted in this work. Accuracy is the proportion of all samples that the classifier labels correctly. Precision is the proportion of samples predicted as positive that are actually positive. Recall is the proportion of actual positive samples that are correctly predicted as positive, and the F1-score is the harmonic mean of precision and recall.
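These metrics can be computed directly from the true and predicted labels. The sketch below treats one class as positive and all other classes as negative; how the per-class values are aggregated over the 12 categories is not stated in the text, so no averaging scheme is assumed here. The function name `binary_metrics` and the toy labels are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall, and F1 with one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for illustration.
y_true = ["blight", "blight", "healthy", "blight", "healthy", "healthy"]
y_pred = ["blight", "healthy", "healthy", "blight", "blight", "healthy"]

scores = binary_metrics(y_true, y_pred, positive="blight")
print(scores)
```

With these toy labels there are two true positives, one false positive, and one false negative, so precision, recall, and F1 are all 2/3, and four of the six samples are labeled correctly.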
3.2 Implementation Details
Experiments were run on an HPC platform under CentOS 7.5 with $$8 \times$$ Intel 4216 CPUs, 32 GB of RAM, and $$2 \times$$ 2080Ti GPUs, using Python with the TensorFlow framework. AlexNet, VGG-16, Inception-ResNet-v2, ResNet-50, and DenseNet-121 were all evaluated on the same dataset in the same experimental environment; for comparison, each model was trained both with transfer learning (from pre-trained weights) and without transfer learning. The dataset was divided into training, testing, and validation sets in proportions of 0.6, 0.2, and 0.2, respectively. All deep learning models used the categorical cross-entropy loss function with an initial learning rate of 0.0001 and were trained for 200 epochs.
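The 0.6/0.2/0.2 division can be reproduced with a short helper. This is a minimal sketch that assumes a seeded random shuffle without per-class stratification; the text does not say how the split was drawn, and the function name `split_dataset` and the integer identifiers are hypothetical.

```python
import random


def split_dataset(items, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle `items` and split them into train/test/validation lists."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # round() avoids losing a sample to floating-point error in the product.
    n_train = round(len(shuffled) * fractions[0])
    n_test = round(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder absorbs rounding
    return train, test, validation


image_ids = range(16060)  # stand-in for the 16,060 image paths
train, test, validation = split_dataset(image_ids)
print(len(train), len(test), len(validation))
```

For 16,060 images this yields 9,636 training, 3,212 testing, and 3,212 validation samples, and fixing the seed keeps the split identical across the models being compared.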
3.3 Results Analysis and Discussion
The training and validation accuracy in this experiment are shown in Fig. 4. All models that use transfer learning outperform their counterparts trained without it, which supports the effectiveness of transfer learning in enhancing identification capability. Table 1 shows the identification results on the test set. On both the validation set and the test set, the transfer learning models outperform the models trained without transfer learning. The largest accuracy improvement, for ResNet-50, exceeds 10 percentage points, and DenseNet-121 achieves the highest accuracy of 98%. These results indicate that inheriting parameters and weights from pre-trained models enhances the feature representation capability in the target domain.
Unlike training from scratch, transfer learning acquires prior knowledge from pre-trained models through three approaches: instances, features, and parameters. In this work, the correlation between the source and target domains was investigated along the instance, feature, and parameter branches, and a transfer learning adaptation strategy was presented to select the most suitable pre-trained models for the target dataset. The experimental results demonstrate that transfer learning significantly enhances the performance of the original models and offers a solution to limited samples in recognition tasks.
Furthermore, transfer learning can greatly shorten training time and improve training efficiency. As Table 1 shows, VGG-16 and AlexNet require less training time than the other models because their simpler architectures involve fewer computations. Overall, the ResNet-50 architecture achieved the best balance between identification accuracy and efficiency.
Fig. 4. Training and validation results of Inception-ResNet-v2, ResNet-50, and DenseNet-121 with transfer learning (left column) and without transfer learning (right column).
Table 1. Experimental results of models with and without transfer learning
It is worth noting that the proposed method introduces a stem stage for model selection compared to training from scratch. This stage involves analyzing the data distribution between the source and target domains and may require additional time. Nevertheless, it does not diminish the advantages offered by the proposed approach.
4. Conclusion
Timely identification of crop diseases, together with corresponding measures to protect crops from infection, is good practice in plant protection. Computer-based approaches are expected to be widely used in crop disease identification as artificial intelligence technologies play an important role in more and more fields. Because training data for crop diseases are limited, deep learning-based models for crop disease identification frequently suffer from overfitting and unsatisfactory performance. To address these pitfalls, a transfer learning adaptation strategy was proposed in this study. The target and source domains were analyzed in detail, and the analysis indicated that these problems can be addressed by pre-training on ImageNet. Therefore, deep transfer learning models pre-trained on ImageNet were employed to address low accuracy in crop leaf disease identification. Specifically, the parameters and weights were transferred from the pre-trained models, and a fully connected layer was added for the crop leaf disease identification task. In addition, a dropout layer was added to mitigate overfitting. To enhance the generalization capabilities of the proposed models, a normalization layer was introduced into the network architecture. Experimental results demonstrated that transfer learning-based approaches significantly improved performance compared with training from scratch across the five evaluation metrics in nearly all cases. The presented method achieved satisfactory results and outperformed existing models, especially in the case of ResNet-50, where accuracy improved from 0.86 to 0.97 with the proposed method. Importantly, this model supports further control of plant diseases and the protection of crops from disease threats.
Nonetheless, the use of transfer learning is not without drawbacks, as evidenced by a slight decline in the recall of Inception-ResNet-v2 from 0.83 to 0.82. This suggests that transfer learning does not improve every metric for every model. One possible reason is incomplete adaptation of the pre-trained model to the target dataset, a matter to be investigated in future work.