Fast Leaf Recognition and Retrieval Using Multi-Scale Angular Description Method

Guoqing Xu* and Shouxiang Zhang**

Abstract: Recognizing plant species based on leaf images is challenging because of the large intra-class variations and inter-class similarities among different plant species. The effective extraction of leaf descriptors constitutes the most important problem in plant leaf recognition. In this paper, a multi-scale angular description method is proposed for fast and accurate leaf recognition and retrieval tasks. The proposed method uses a novel scale-generation rule to develop an angular description of leaf contours. It is parameter-free and can capture leaf features from coarse to fine at multiple scales. A fast Fourier transform is applied to make the descriptor compact and effective for matching. Both support vector machine and k-nearest neighbor classifiers are used to classify leaves. Leaf recognition and retrieval experiments were conducted on three challenging datasets, namely the Swedish leaf, Flavia leaf, and ImageCLEF2012 leaf datasets. The results are evaluated with widely used standard metrics and compared with several state-of-the-art methods. The results and comparisons show that the proposed method not only requires a low computational time, but also achieves good recognition and retrieval accuracies on challenging datasets.

Keywords: Image Retrieval, k-NN, Leaf Recognition, Multi-Scale, SVM

1. Introduction

Plants play an essential role in every aspect of human life, providing oxygen, fresh food, fuel, and medicine for humans [1]. Medicinal herbs have been used to prevent and cure diseases in China for a long time. Generally, plants are easily recognized by botanists who have many years of work experience. However, this is very challenging for ordinary people because there are numerous diverse plant species in the world, including many new and rare species [2]. Therefore, it is necessary and useful to implement plant identification with simple and reliable methods, so that people without the requisite specialized knowledge and experience can also identify plants [3]. With the development of imaging-related technologies, automatic plant species identification has attracted broad attention from researchers in the fields of computer vision and plant taxonomy [4]. Among the steps of automatic identification, effective feature extraction is the most challenging, and it directly affects the performance of subsequent classifier training [5]. Therefore, a considerable amount of effort goes into the design of feature descriptors. Moreover, the morphological characteristics of leaves are instrumental to plant species identification. Hence, different leaf feature extraction methods have been presented in recent years to perform plant species identification using leaf images [2,6]. The most commonly used morphological characteristic of leaves is the shape, because the leaf shape is more stable than the leaf color and veins [7]. More importantly, it carries distinctive information that is valuable for distinguishing among different plant species. Existing studies on leaf identification using only shape features are reviewed below.

In earlier work, simple leaf shape features were used, including length, width, aspect ratio, and leaf diameter, which are contour-based descriptors. Simple region-based descriptors include solidity, area, rectangularity, and eccentricity. However, such descriptors are not suitable for irregular, complex contours.
In recent times, increasingly sophisticated descriptors have been introduced to identify plant leaves, such as curvature scale space (CSS), inner-distance shape context (IDSC), and the multi-scale distance matrix (MDM). IDSC is an extended version of the shape context (SC), which replaces the Euclidean distance in SC with the inner-distance. IDSC can achieve high recognition accuracy on two leaf datasets, but its leaf matching is time-consuming because of the use of dynamic programming. A manifold learning method called supervised global-locality preserving projection (SGLP) was presented for leaf recognition [1]. The SGLP method uses a global weighted inter-scatter matrix to describe the intrinsic manifold structure for classification. Xu et al. [8] proposed an included-angular ternary pattern (IATP) for shape-based image retrieval. IATP is a promising multi-scale shape descriptor that achieves good leaf retrieval performance on the well-known Swedish leaf dataset. Recently, a method called triangle-distance representation (TDR) was also presented for plant leaf recognition [9]. TDR achieves very high leaf recognition accuracy on the Swedish leaf, Smithsonian leaf, Flavia leaf, and ImageCLEF2012 leaf datasets. However, TDR is an integrated method in which two types of shape features are fused (see [7] for more about plant leaf recognition methods using images).

In this paper, we propose a multi-scale angular description method that uses angular information to describe the leaf shape. This method uses a novel scale-generation rule to determine the scale parameters of the angular description. Three challenging leaf datasets are used to test the performance of the method.

2. Proposed Multi-Scale Angular Description Method

In this section, a multi-scale angular description method using a novel scale-generation rule is proposed for leaf feature description and recognition. The multi-scale angular description is developed from the leaf contour, which is obtained by converting the original color leaf image into a binary image and then tracing the leaf shape in a clockwise direction. A sampled leaf L is expressed by N points [TeX:] $$P_{i}=\left(x_{i}, y_{i}\right)(i=1,2, \ldots, N),$$ where [TeX:] $$x_{i} \text { and } y_{i}$$ are the x-coordinate and the y-coordinate of [TeX:] $$P_{i}.$$

2.1 Proposed Scale-Generation Rule

The proposed scale-generation rule is developed using the golden ratio (GR) to find the paired neighbor points from which the angular description of leaves is derived. Three steps are involved, as follows:

Step 1. For each [TeX:] $$P_{i}$$ on leaf contour L, the contour is first trisected, taking [TeX:] $$P_{i}$$ as the starting point. Thus, two trisection points are found, which constitute the first pair of neighbor points. This is the first scale for [TeX:] $$P_{i}.$$

Step 2. Taking [TeX:] $$P_{i}$$ as the starting point and each of the newly found neighbor points as an endpoint, two leaf segments are obtained, which can be expressed as [TeX:] $$\overline{P_{i} P_{r}} \text { and } \overline{P_{i} P_{l}}.$$ Then, on the two segments, there are two points satisfying
(1) [TeX:] $$\frac{\left|\overline{P_{i} P_{r}}\right|}{\left|\overline{P_{i} P_{r n}}\right|}=\mathrm{GR}$$

and
(2) [TeX:] $$\frac{\left|\overline{P_{i} P_{l}}\right|}{\left|\overline{P_{i} P_{l n}}\right|}=\mathrm{GR}$$

where [TeX:] $$|\cdot|$$ denotes the length of a leaf segment, and [TeX:] $$P_{l n} \text { and } P_{r n}$$ are contour points. Then, two new points are found, and they constitute a new pair of neighbor points at the next scale.

Step 3. Step 2 is iterated as long as the ratio of the length of the newest leaf segment to that of L is greater than 1:100; otherwise, no new points are found and no further scale is generated.

It can be seen from the above three steps that a certain number of paired points are found for each leaf contour point, and this number equals the number of scales, which is constant even though the number of contour points varies. Specifically, the three steps generate seven scales for each contour point, and there are seven paired neighbor points corresponding to the seven scales. Fig. 1 demonstrates the seven paired neighbor points of [TeX:] $$P_{i}$$ on a sampled leaf contour, where each pair of points is marked with a different marker.
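To make the scale-generation rule concrete, the following minimal Python sketch computes, for one contour point, the paired neighbor points at each scale. It assumes a closed contour uniformly sampled as an (N, 2) NumPy array, measures segment lengths by the number of sampled points, and fixes the number of scales at seven as reported above; the names `scale_offsets` and `paired_points` are illustrative and not taken from the paper.

```python
import numpy as np

GR = (1 + np.sqrt(5)) / 2  # golden ratio used by the scale-generation rule


def scale_offsets(N, num_scales=7):
    """Contour offsets (in sampled points) of the paired neighbors of P_i.

    Scale 1 trisects the contour starting at P_i; every further scale shortens
    the contour segment between P_i and its neighbors by the golden ratio,
    following Eqs. (1) and (2). The paper reports that its 1:100 stopping rule
    yields seven scales, so num_scales defaults to 7 here.
    """
    offsets = []
    arc = N / 3.0                        # scale 1: trisection points
    for _ in range(num_scales):
        offsets.append(max(1, int(round(arc))))
        arc /= GR                        # next scale: segment divided by GR
    return offsets


def paired_points(contour, i, num_scales=7):
    """Paired neighbor points (P_ir^S, P_il^S) of contour[i] for each scale S."""
    contour = np.asarray(contour, dtype=float)
    N = len(contour)
    return [(contour[(i + k) % N], contour[(i - k) % N])
            for k in scale_offsets(N, num_scales)]
```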
2.2 The Multi-Scale Angular Description

For each point on the leaf contour, angle information under the seven scales can be calculated using the seven paired neighbor points. Assuming that, under scale [TeX:] $$S(S=1, \ldots, 7),$$ [TeX:] $$P_{i r}^{S} \text { and } P_{i l}^{S}$$ are the paired neighbor points of [TeX:] $$P_{i},$$ the angle [TeX:] $$\theta_{i}^{S}$$ is calculated using the following formula:

(3) [TeX:] $$\theta_{i}^{S}=\arccos \frac{\overrightarrow{P_{i} P_{i r}^{S}} \cdot \overrightarrow{P_{i} P_{i l}^{S}}}{\left|\overrightarrow{P_{i} P_{i r}^{S}}\right|\left|\overrightarrow{P_{i} P_{i l}^{S}}\right|}$$

where [TeX:] $$\overrightarrow{P_{i} P_{i r}^{S}} \text { and } \overrightarrow{P_{i} P_{i l}^{S}}$$ are the vectors formed by [TeX:] $$P_{i}, P_{i r}^{S}, \text { and } P_{i l}^{S}.$$
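As a small illustration of Eq. (3), the sketch below computes the seven angles of a single contour point from its paired neighbor points, for example those returned by the `paired_points` helper sketched above; `point_angles` is an illustrative name, and clipping the cosine is a numerical safeguard added here.

```python
import numpy as np


def point_angles(p_i, neighbor_pairs):
    """Angles theta_i^S of Eq. (3) for one contour point p_i, given its list of
    paired neighbor points [(P_ir^S, P_il^S) for S = 1, ..., 7]."""
    p_i = np.asarray(p_i, dtype=float)
    angles = []
    for p_r, p_l in neighbor_pairs:
        v_r = np.asarray(p_r, dtype=float) - p_i          # vector P_i -> P_ir^S
        v_l = np.asarray(p_l, dtype=float) - p_i          # vector P_i -> P_il^S
        c = v_r @ v_l / (np.linalg.norm(v_r) * np.linalg.norm(v_l))
        angles.append(float(np.arccos(np.clip(c, -1.0, 1.0))))  # Eq. (3)
    return angles
```

Stacking these seven-angle rows for all N contour points gives the N×7 description matrix defined next.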
Because there are seven angles under seven scales for each leaf contour point, the multi-scale angular description of leaf L is defined as follows:

(4) [TeX:] $$\theta=\left(\begin{array}{lll} \theta_{1} & \cdots & \theta_{N} \end{array}\right)^{T}=\left(\begin{array}{ccc} \theta_{1}^{1} & \cdots & \theta_{1}^{7} \\ \vdots & \ddots & \vdots \\ \theta_{N}^{1} & \cdots & \theta_{N}^{7} \end{array}\right)$$

Fig. 2 depicts two colored leaves, their binary versions, and the corresponding contours; the two leaves belong to the same plant species from the Flavia dataset. Fig. 3 shows the multi-scale angular descriptions of the leaves in Fig. 2, where the leaves in Fig. 2(a) and 2(d) are indicated by a blue solid line and a red dotted line, respectively. As shown in Fig. 3, the angular descriptions under scales 1–2 change slowly, and most of the descriptions of the two leaves are almost the same, reflecting the global structure of the two leaves. The differences between the angular descriptions of the two leaves under scales 3–4 gradually increase. Under scales 5–7, the curves of the angular descriptions change dramatically, and few of them overlap, showing prominent local variations between the two leaves. In summary, the multi-scale angular description can capture the discriminative features of leaves from coarse to fine.

2.3 Leaf Matching

Existing similarity measures include the city block distance, the Euclidean distance, and dynamic programming (DP) algorithms, and these can be used to match multi-scale angular descriptions. However, DP is too time-consuming for efficient leaf matching on large-scale leaf datasets. For the city block or Euclidean distance, the similarity between two multi-scale angular descriptions changes if the leaf contours have different starting points. Therefore, the fast Fourier transform is applied to the angular description under each scale to deal with this problem, as shown below:
(5) [TeX:] $$F_{t S}=\frac{1}{N} \sum_{i=1}^{N} \theta_{i}^{S} \exp \left(\frac{-j 2 \pi t i}{N}\right)$$

Coefficients [TeX:] $$F_{t S}$$ are obtained for scales S = 1, …, 7. To make the final descriptor compact, only the magnitudes of [TeX:] $$F_{t S}$$ with small t are used, that is,
(6) [TeX:] $$D=\left[\begin{array}{ccc} \left|F_{11}\right| & \cdots & \left|F_{17}\right| \\ \vdots & \ddots & \vdots \\ \left|F_{M 1}\right| & \cdots & \left|F_{M 7}\right| \end{array}\right]$$

where [TeX:] $$M \leq N$$ and [TeX:] $$|\cdot|$$ denotes the magnitude of a coefficient. The multi-scale angle descriptor can also be expressed in vector form as follows:
(7) [TeX:] $$D=\left[\left|F_{11}\right|, \cdots,\left|F_{17}\right|, \cdots,\left|F_{M 1}\right|, \cdots,\left|F_{M 7}\right|\right]$$

Given two leaves, A and B, their multi-scale angle descriptors are represented by [TeX:] $$F_{m S}^{A} \text { and } F_{m S}^{B},$$ respectively. The dissimilarity between A and B is calculated using the [TeX:] $$\mathrm{L}_{1}$$ distance, which is given by

(8) [TeX:] $$\operatorname{Dist}(A, B)=\sum_{S=1}^{7} \sum_{m=1}^{M}\left|F_{m S}^{A}-F_{m S}^{B}\right|$$

A small distance indicates that the two leaves are likely to belong to the same species.

2.4 Leaf Classifiers

Researchers have proposed many classifiers to recognize objects in images, such as the support vector machine (SVM), the k-nearest neighbor (k-NN) classifier, and artificial neural networks (ANNs). Here, both SVM and k-NN are used to classify leaves based on their multi-scale angle descriptors. The leaves in each image dataset are divided into training and testing subsets. k-NN with k=1 (that is, 1-NN) is used without a training process. An SVM model is trained using the leaves in the training subset and then utilized to classify the leaves in the testing subset.

3. Recognition and Retrieval Performances and Comparisons

The leaf recognition and retrieval performances of the proposed method were tested on a desktop computer with an Intel i5-8400 CPU and MATLAB software. The dimension of the multi-scale angular descriptor is 49, with M=7 for each of the seven scales. The SVM model is trained with parameters c=49.69 and g=32.66 for recognition. Three challenging datasets were used: the Swedish, Flavia, and ImageCLEF2012 leaf datasets. This paper compares the proposed method with state-of-the-art methods using standard metrics, such as recognition accuracy, mean average precision (MAP), and recognition accuracy versus the top N images. It is worth noting that "recognition" in botany corresponds to "classification" in computer vision.

3.1 Performances on Swedish leaf dataset

The Swedish leaf dataset contains 1,125 leaf images from 15 tree species, with 75 leaf images per species. Fig. 4 shows representative leaf samples from all species. This dataset is well suited to testing classification and retrieval performance because of its large intra-class variations and inter-class similarities among different species.

The Swedish leaf dataset was first used to test the leaf recognition performance of the proposed descriptor. For convenient comparison, recognition accuracy is used as the evaluation metric. Twenty-five leaves were randomly selected from each tree species to form the training subset, and the remaining 50 leaves of each species form the testing subset. All 750 testing leaves were classified 10 times using SVM and 1-NN, respectively. The average accuracies of the proposed method are compared with those of multiple state-of-the-art methods, including the well-known IDSC, MDM, VGG16 [9], and AlexNet [9], as well as seven other notable methods. The accuracies of the compared methods are taken from the literature [1,9,10]. Table 1 shows that the proposed method using SVM achieves the highest recognition accuracy, 97.11%, among these methods. It is 2.98% higher than IDSC and 8.99% higher than SC, both of which rely on time-consuming DP matching. It is also 1.36% higher than AlexNet and 1.27% higher than VGG16, two well-known deep learning methods. The proposed method also achieves good recognition accuracy using 1-NN as the classifier. This demonstrates the effectiveness of the proposed method.

We also tested the performance of the proposed method with fewer dimensions. The proposed scale-generation rule was stopped at scales 6, 5, 4, and 3 separately, keeping the same number of features for each scale. The recognition accuracies of the proposed method using 1-NN on the Swedish leaf dataset are shown in Table 2.
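The descriptor construction and matching of Eqs. (6)–(8) can be sketched as follows, assuming theta is the N×7 angle matrix of Eq. (4). The 1/N normalization and the inclusion of the lowest-frequency coefficient are assumptions of this sketch, and `fourier_descriptor` and `dissimilarity` are illustrative names. Taking magnitudes is what removes the dependence on the contour starting point, since a change of starting point only alters the phase of the Fourier coefficients.

```python
import numpy as np


def fourier_descriptor(theta, M=7):
    """Compact descriptor of Eqs. (6)-(7): magnitudes of the first M Fourier
    coefficients of each scale's angular sequence (theta is the N x 7 matrix)."""
    F = np.fft.fft(theta, axis=0) / len(theta)   # FFT along the contour, per scale
    return np.abs(F[:M, :]).ravel()              # M x 7 magnitudes, flattened


def dissimilarity(d_a, d_b):
    """L1 distance between two descriptors, as in Eq. (8)."""
    return float(np.sum(np.abs(d_a - d_b)))
```

With M=7 this yields the 49-dimensional descriptor used in the experiments of Section 3.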
As can be seen in Table 2, the multi-scale angular description with the 7×7 setting achieved the highest recognition accuracy. Therefore, this setting is retained in the following experiments.

The Swedish leaf dataset was then used to test the computational efficiency of the proposed descriptor for a more comprehensive comparison. The elapsed times of the proposed method and IDSC were recorded and averaged for feature extraction and matching, as shown in Table 3. Table 3 shows that the proposed method takes only 9.56% of the time taken by IDSC for feature extraction, and only 0.12% of the time taken for matching. Hence, the proposed method has very high computational efficiency.

Table 1. Recognition accuracies of the proposed method and compared methods on the Swedish leaf dataset
Table 2. Recognition accuracies of the proposed method using 1-NN on the Swedish leaf dataset with different numbers of scales
Table 3. Average computational time of the proposed method and IDSC for feature extraction and matching on the Swedish leaf dataset
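As a hedged illustration of the classification protocol of Sections 2.4 and 3.1, the sketch below averages SVM and 1-NN accuracies over random 25/50 splits per species, assuming the 49-dimensional descriptors and integer species labels are already available as NumPy arrays. Mapping the reported parameters c and g onto scikit-learn's C and gamma of an RBF-kernel SVC is an assumption, and `evaluate_swedish` is an illustrative name.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier


def evaluate_swedish(descriptors, labels, train_per_class=25, runs=10, seed=0):
    """Average SVM and 1-NN accuracies over random per-species splits."""
    X, y = np.asarray(descriptors, dtype=float), np.asarray(labels)
    rng = np.random.RandomState(seed)
    acc_svm, acc_knn = [], []
    for _ in range(runs):
        tr, te = [], []
        for c in np.unique(y):
            idx = rng.permutation(np.where(y == c)[0])
            tr.extend(idx[:train_per_class])     # 25 training leaves per species
            te.extend(idx[train_per_class:])     # remaining leaves for testing
        svm = SVC(C=49.69, gamma=32.66).fit(X[tr], y[tr])   # c, g quoted in Section 3
        knn = KNeighborsClassifier(n_neighbors=1).fit(X[tr], y[tr])
        acc_svm.append(svm.score(X[te], y[te]))
        acc_knn.append(knn.score(X[te], y[te]))
    return float(np.mean(acc_svm)), float(np.mean(acc_knn))
```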
3.2 Performances on Flavia leaf dataset

The Flavia leaf dataset contains 1,907 leaf images from 32 plant species, with 50–77 leaf images per species. It is very challenging and has been used in many plant leaf recognition and retrieval studies. Leaves from each of the 32 plant species are shown in Fig. 5.

The Flavia leaf dataset was first used to test the leaf recognition performance of the multi-scale angular descriptor. To compare directly with state-of-the-art methods, the same evaluation protocol as in [6] is used. For each plant species, 40 leaf images were chosen randomly as training images, and ten leaf images were chosen randomly from the rest as testing images. All 320 testing images were classified using SVM and 1-NN, respectively. These steps were repeated ten times. The average accuracies of the proposed method are compared with those of seven notable methods, as shown in Table 4. The accuracies of the compared methods are taken from [5,6].

Table 4. Recognition accuracies of the proposed method and compared methods on the Flavia leaf dataset
Table 4 shows that the proposed method achieves the highest recognition accuracy of 93.25% using SVM. It is 5.75% higher than the well-known SIFT and 5.33% higher than Deep CNN, a deep learning method. The proposed method using 1-NN also achieves a very high accuracy of 88.13%.

The Flavia leaf dataset was then used to test the leaf retrieval performance of the multi-scale angle descriptor. Each of the 1,907 leaves was taken as a query once, and the MAP value of the proposed method is shown in Table 5. The MAP values of the compared methods are reported in [6]. It can be seen from Table 5 that the proposed method obtains higher MAP values than most of the compared methods, and is second only to the shape context method, which relies on a time-consuming matching algorithm.

Table 5. MAP values of the proposed method and compared methods on the Flavia leaf dataset
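A minimal sketch of how MAP figures of this kind can be computed for leave-one-out retrieval, assuming one descriptor per leaf and the L1 distance of Eq. (8); `mean_average_precision` is an illustrative name, and the averaging convention (mean of per-query average precisions) is the standard one assumed here.

```python
import numpy as np


def mean_average_precision(descriptors, labels):
    """Leave-one-out retrieval MAP: each leaf is the query once, and the
    remaining leaves are ranked by the L1 distance of Eq. (8)."""
    X = np.asarray(descriptors, dtype=float)
    y = np.asarray(labels)
    aps = []
    for q in range(len(X)):
        mask = np.arange(len(X)) != q                  # drop the query itself
        dist = np.sum(np.abs(X[mask] - X[q]), axis=1)  # L1 distance to the rest
        rel = (y[mask][np.argsort(dist)] == y[q])      # True where species matches
        if not rel.any():
            continue
        prec = np.cumsum(rel) / (np.arange(rel.size) + 1.0)
        aps.append(np.sum(prec[rel]) / rel.sum())      # average precision for this query
    return float(np.mean(aps))
```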
3.3 Performances on ImageCLEF2012 leaf dataset

The ImageCLEF2012 leaf dataset is a famous and challenging dataset for plant leaf recognition, and it contains "Scan", "Pseudo-scan", and "Photograph" subsets. The "Scan" subset is the most widely used and accounts for 57% of the ImageCLEF2012 leaf dataset; it contains 6,630 leaf images from 115 species. The number of testing leaves is 1,760; they belong to 105 species and were contributed by 10 users. Representative testing leaves are shown in Fig. 6.

For convenient comparison, the recognition accuracy versus top N leaves metric is used to evaluate the performance of the proposed method on the ImageCLEF2012 dataset. In this metric, each leaf in the testing subset is taken as a query, and the distances between the query and all leaves in the training subset are calculated and sorted in ascending order. If at least one leaf among the top N leaves belongs to the same species as the query, the query is regarded as successfully recognized. The recognition results of the 1,760 testing leaves were recorded for N = 1, …, 20. Fig. 7 shows the recognition accuracies of the proposed method and seven other notable methods, whose results are taken from the literature [9].

As can be seen from Fig. 7, the proposed method achieves high recognition accuracy in most cases. When [TeX:] $$\mathrm{N} \leq 10,$$ the recognition accuracy of the proposed method is the second highest, just below that of the TDR method. The proposed method outperforms the deep learning methods AlexNet and VGG16. It is also better than IDSC and SC, which use DP for matching. When [TeX:] $$\mathrm{N}>10,$$ the recognition accuracy of the proposed method remains second but is very close to that of the TDR method. To make a more comprehensive comparison with TDR, we re-implemented the feature extraction stage of the TDR method to compare the computational efficiency of TDR and our method. The feature extraction stage was run ten times on the same computer. The average computational time of the feature extraction stage of the TDR method is 5.31 ms, whereas our method takes only 1.87 ms. Compared with the TDR method, our method thus has an obvious advantage in computational efficiency. The reasons are two-fold. First, TDR is an integrated method in which both the triangular centroid distance (TCD) and the sign of TCD are fused at each scale, and calculating the sign of TCD is time-consuming. Second, the number of scales of TDR is log2(512/2) = 8, whereas there are only seven scales in our method. Overall, compared with TDR, our method trades a slight reduction in accuracy for a clear improvement in computational efficiency.

Finally, all three datasets were combined to test the generalization ability of the proposed method. The combined dataset contains 9,662 leaf images. Each of the 9,662 leaf images was taken as a query once, and the MAP of the proposed method on the combined dataset is 41.72%. We also tested the unified method [6] on the combined dataset, and its MAP was 39.76%. These results indicate that the proposed method is very effective.
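A hedged sketch of the "recognition accuracy versus top N" metric used for the ImageCLEF2012 experiments: each testing leaf counts as recognized at rank N if at least one of its N nearest training leaves (by the L1 distance of Eq. (8)) belongs to the same species. Array inputs and the name `top_n_accuracy` are assumptions of this sketch.

```python
import numpy as np


def top_n_accuracy(train_desc, train_labels, test_desc, test_labels, max_n=20):
    """Recognition accuracy versus top N: a query is recognized at rank N if at
    least one of its N nearest training leaves shares its species."""
    Xtr, ytr = np.asarray(train_desc, dtype=float), np.asarray(train_labels)
    Xte, yte = np.asarray(test_desc, dtype=float), np.asarray(test_labels)
    hits = np.zeros(max_n)
    for q in range(len(Xte)):
        dist = np.sum(np.abs(Xtr - Xte[q]), axis=1)   # L1 distances to training leaves
        ranked = ytr[np.argsort(dist)]                # training species, nearest first
        match = ranked[:max_n] == yte[q]
        first = np.argmax(match) if match.any() else max_n
        hits[first:] += 1                             # recognized for every N past the first hit
    return hits / len(Xte)                            # accuracies for N = 1 .. max_n
```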
4. Conclusion

In this paper, we proposed a novel multi-scale angular description method for fast plant leaf recognition and retrieval. In the proposed method, a scale-generation rule based on the GR is used to find paired neighbor points from which the angular description of leaves is derived. The resulting multi-scale angular descriptor is compact and efficient. Plant leaf recognition and retrieval experiments and comparisons with state-of-the-art methods on three challenging datasets show that the proposed method is computationally efficient and achieves promising performance. The vein structure is another distinctive feature of leaves, and in future work we will design a vein description method to distinguish leaves whose shapes are similar.

Biography

Guoqing Xu
https://orcid.org/0000-0002-0405-7691
He received his Ph.D. in Control Science and Control Engineering from the University of Science and Technology Beijing, China, in 2014. He is a lecturer in the School of Information Engineering, Nanyang Institute of Technology. His research interests include content-based image retrieval, automatic image annotation, machine learning, and pattern recognition.

Biography

Shouxiang Zhang
https://orcid.org/0000-0001-7828-3977
He received his Ph.D. in Control Science and Control Engineering from China University of Mining and Technology, Beijing, China, in 2006. He is a professor in the School of Information and Electronic Engineering, Shandong Technology and Business University. He has long been engaged in research, development, and teaching of electronic information equipment and technology for intelligent coal mining, and is proficient in signal processing, programmable devices, fieldbus networks, software development, artificial intelligence, and related areas.

References