Article Information
Corresponding Author: Nam-Chul Kim**, nckim@knu.ac.kr
Corresponding Author: Sung-Ho Kim*, shkim@knu.ac.kr
Hee-Hyung Bu*, School of Computer Science and Engineering, Kyungpook National University, Daegu, Korea, hhbu@knu.ac.kr
Nam-Chul Kim**, School of Electronic Engineering, Kyungpook National University, Daegu, Korea, nckim@knu.ac.kr
Byoung-Ju Yun**, School of Electronic Engineering, Kyungpook National University, Daegu, Korea, bjisyun@ee.knu.ac.kr
Sung-Ho Kim*, School of Electronic Engineering, Kyungpook National University, Daegu, Korea, shkim@knu.ac.kr
Received: October 2, 2018
Revision received: February 22, 2019
Accepted: March 8, 2019
Published (Print): August 31, 2020
Published (Electronic): August 31, 2020
1. Introduction
Recently, content-based image retrieval (CBIR) systems have been developed by global IT companies. The Google search engine supports CBIR, but it is weak with rotation- and scale-variant images; it often fails to retrieve images with complex rotations. Bixby on the Samsung Galaxy S8 also supports CBIR for pictures, but when complexly rotated images are queried on the phone, the retrieved images are seldom similar.
Existing methods usually extract features related to texture and color information because these are considered essential functions of the human visual system for object recognition. Research on texture has usually been conducted in the frequency domain. Because the spatial-frequency domain of an image represents the rate of change of pixel values, the variation of edges can be determined there. In particular, because high and low frequencies can be separated, high-frequency components, which include most edge variation, are employed in many research areas. Representative frequency-domain methods include the Gabor transform [1], the wavelet transform [2], and the Fourier transform [3].
Research on color includes the color histogram [4] and the color autocorrelogram [5]. The color histogram is popular because it is statistical: it does not consider local relations, it is measured over the global area, and it is rotation- and scale-invariant. It has been adopted in many studies because of its simplicity and wide applicability.
As a method that employs color distance, the color autocorrelogram adds distance information to the color histogram. Recently, the major goal of image retrieval studies has been rotation and scale invariance for rotated and scaled variants of images. Examples of such methods are rotation- and scale-invariant Gabor features for texture image retrieval proposed by Han and Ma [6]; color autocorrelogram and block difference of inverse probabilities-block variation of local correlation coefficient in the wavelet domain proposed by Chun et al. [7]; a texture feature extraction method for rotation- and scale-invariant image retrieval proposed by Rahman et al. [8]; rotation-invariant textural feature extraction for image retrieval using eigenvalue analysis of intensity gradients and multi-resolution analysis proposed by Gupta et al. [9]; rotation-invariant texture retrieval considering the scale dependence of the Gabor wavelet proposed by Li et al. [10]; and CBIR using combined color and texture features extracted by multi-resolution multi-direction (MRMD) filtering proposed by Bu et al. [11]. However, among the latest retrieval methods, some are not rotation-invariant and yet show good performance on databases composed of photographs taken by photographers on the ground. One of them is the retrieval method using the multichannel decoded local binary pattern (LBP) proposed by Dubey et al. [12].
In image retrieval, a number of aspects need to be considered. In this paper, we consider two: (1) features should contain little redundant information, and (2) the dimension of the feature vector should not be very large. The color features employed in this paper are extracted from autocorrelograms [5] using the distance of colors in the chrominance space of the hue (H) and saturation (S) components. The texture features are extracted from the complete local binary pattern (CLBP) based on MRMD filtering [11] in the luminance space of the value (V) component. The MRMD filters allow easy extraction of rotation-invariant features. CLBP [13] generalizes the LBP proposed by Ojala et al. [14] and yields more texture information than LBP. The HSV color space is more efficient for image retrieval than the RGB color space because texture information is contained in the luminance (V) component and color information in the chrominance (H and S) components.
This paper combines the CLBP texture features based on MRMD filtering with the color autocorrelogram features. This yields high retrieval performance while keeping the feature dimension moderate. Moreover, the amount of redundant information is small because the HSV color space separates the luminance (V) component, used for the texture features, from the chrominance (H and S) components, used for the color features. Furthermore, the proposed CLBP is scale-based, unlike the conventional CLBP, which is distance-based.
The proposed method is explained in Section 2. Experiment and results are discussed in Section 3. Finally, the conclusion is presented in Section 4.
2. The Proposed Texture and Color Features Extraction
In this paper, a CBIR system using CLBP based on MRMD filtering and the color autocorrelogram is proposed.
The CLBP based on MRMD filtering is described in Section 2.1. The color autocorrelogram is the same as the method proposed in our previous paper [15]. Fig. 1 shows the block diagram of the proposed image retrieval system.
Fig. 1. Block diagram of the proposed CBIR system.
2.1 Texture Feature Extraction Using MRMD Filtering-Based CLBP
2.1.1 CLBP_S feature extraction in MRMD filtering
Texture features are extracted using CLBP based on MRMD filtering in the $V$ component. CLBP_S corresponds to RULBP [14]. The CLBP_S feature extraction on the MRMD filtered domain includes four steps as follows:
Step 1: Convert the RGB query image $I$ to an HSV image to obtain the $V$-component image $I_{V}$.
Step 2: Conduct MRMD high-pass filtering [11] with the $N$ directions at a resolution $r$ on the image $I_{V}$ to obtain the filtered images $y_{r, \theta}$. The resolution level is one of $r \in\{1,2, \ldots, M\}$, where $M$ is the number of resolution levels, and the directions are $\theta=(2 \pi n) / N$, where $n \in \{0,1,2, \ldots, N-1\}$ and $N$ is the number of directions.
Step 3: Create a binary image from the pixel values of the filtered images. The outcome is the LBP of the image $I_{V}$. The LBP based on MRMD filtering can be expressed as follows:

$$LBP_{N, 2^{r-1}}(p)=\sum_{n=0}^{N-1} s\left(y_{r, \theta_{n}}(p)\right) 2^{n},\qquad (1)$$

where $s(x)=\begin{cases}1, & x \geq 0 \\ 0, & x<0\end{cases}$, $\theta_{n}$ refers to the $n$-th direction, and $p$ is the pixel position.
Step 4: Create and normalize the RULBP histogram. The RULBP has $N+2$ bins at each resolution level, where $N$ is the total number of directions. The RULBP on the MRMD filtered domain is expressed as follows:

$$LBP_{N, 2^{r-1}}^{riu2}(p)=\begin{cases}\sum_{n=0}^{N-1} s\left(y_{r, \theta_{n}}(p)\right), & \text{if } U\left(LBP_{N, 2^{r-1}}(p)\right) \leq 2 \\ N+1, & \text{otherwise},\end{cases}\qquad (2)$$

where $U\left(LBP_{N, 2^{r-1}}(p)\right)=\sum_{n=0}^{N-1}\left|s\left(y_{r, \theta_{n}}(p)\right)-s\left(y_{r, \theta_{n-1}}(p)\right)\right|$ refers to the number of bit changes in the LBP (with $\theta_{-1}=\theta_{N-1}$).
The normalized RULBP histogram is expressed as follows:

$$H_{r}(i)=\frac{1}{|P|} \sum_{p \in P} \delta\left(LBP_{N, 2^{r-1}}^{riu2}(p), i\right),\qquad (3)$$

where $i \in\{0,1,2, \ldots, N, N+1\}$, $|P|$ stands for the size of $P$, i.e., the size of the image, and $\delta$ refers to the Kronecker delta.
As an outcome, the extracted total feature dimension is M × (N + 2).
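Steps 3 and 4 can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name is ours, and the MRMD high-pass responses for one resolution level are assumed to be already available as an array of shape (N, H, W).

```python
import numpy as np

def rulbp_histogram(y, N):
    """Normalized RULBP (CLBP_S) histogram for one resolution level.

    y : array of shape (N, H, W) -- MRMD high-pass responses for the
        N directions at this level (assumed precomputed).
    Returns a histogram with N + 2 bins that sums to 1.
    """
    b = (y >= 0).astype(int)                 # s(x): 1 if x >= 0, else 0
    # U: number of bit changes over the circular sequence of directions
    u = np.abs(b - np.roll(b, 1, axis=0)).sum(axis=0)
    # uniform patterns (U <= 2) -> bit sum in 0..N; otherwise -> N + 1
    code = np.where(u <= 2, b.sum(axis=0), N + 1)
    hist = np.bincount(code.ravel(), minlength=N + 2).astype(float)
    return hist / code.size                  # divide by image size |P|

# toy example with N = 8 directions on a 16x16 response stack
h = rulbp_histogram(np.random.default_rng(0).standard_normal((8, 16, 16)), N=8)
```

Repeating this over all $M$ resolution levels and concatenating gives the stated $M \times (N+2)$-dimensional CLBP_S feature.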
2.1.2 CLBP_M feature extraction in MRMD filtering
CLBP_M stands for the RULBP of magnitude images. The CLBP_M feature extraction on the MRMD filtered domain includes four steps as follows:
Step 1–2: Steps 1 and 2 are the same as in the RULBP procedure.
Step 3: Compute the average $\mu_{r}(p)$ of the absolute values over the directions at the same pixel position at a resolution $r$ in the filtered images. Then compare each absolute value with the average $\mu_{r}(p)$. The result is the CLBP_M of the image $I_{V}$. The CLBP_M (MLBP) on the MRMD filtered domain is expressed as follows:

$$MLBP_{N, 2^{r-1}}(p)=\sum_{n=0}^{N-1} s\left(\left|y_{r, \theta_{n}}(p)\right|-\mu_{r}(p)\right) 2^{n},\qquad (4)$$

where $\mu_{r}(p)=\frac{1}{N} \sum_{n=0}^{N-1}\left|y_{r, \theta_{n}}(p)\right|$.
Step 4: Create and normalize the RUMLBP histogram from the CLBP_M. The RUMLBP has $N+2$ bins at each level, where $N$ is the total number of directions. The RUMLBP on the MRMD filtered domain is expressed as follows:

$$MLBP_{N, 2^{r-1}}^{riu2}(p)=\begin{cases}\sum_{n=0}^{N-1} s\left(\left|y_{r, \theta_{n}}(p)\right|-\mu_{r}(p)\right), & \text{if } U\left(MLBP_{N, 2^{r-1}}(p)\right) \leq 2 \\ N+1, & \text{otherwise}.\end{cases}\qquad (5)$$
The normalized RUMLBP histogram with CLBP_M is the same as in Eq. (3).
As an outcome, the extracted total feature dimension is M × (N + 2).
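Under the same assumptions as before (precomputed responses of shape (N, H, W); function name ours), the CLBP_M steps replace the sign test on the responses with a test of each magnitude against the per-pixel average magnitude over the directions, which is our reading of Step 3.

```python
import numpy as np

def rumlbp_histogram(y, N):
    """Normalized RUMLBP (CLBP_M) histogram for one resolution level.

    y : array of shape (N, H, W) of MRMD responses (assumed precomputed).
    Each magnitude |y| is thresholded against mu_r, the average magnitude
    over the N directions at the same pixel (our reading of Step 3).
    """
    mag = np.abs(y)
    mu_r = mag.mean(axis=0)                  # per-pixel average magnitude
    b = (mag >= mu_r).astype(int)            # s(|y| - mu_r)
    u = np.abs(b - np.roll(b, 1, axis=0)).sum(axis=0)
    code = np.where(u <= 2, b.sum(axis=0), N + 1)
    hist = np.bincount(code.ravel(), minlength=N + 2).astype(float)
    return hist / code.size

hm = rumlbp_histogram(np.random.default_rng(1).standard_normal((8, 16, 16)), N=8)
```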
2.1.3 CLBP_C feature extraction in MRMD filtering
CLBP_C [13] is a feature related to a center pixel, but in this paper, CLBP_C relates to the value averaged over all directions instead of the center. The CLBP_C operator yields a histogram resulting from the comparison between each value averaged over all directions and the global average. Two bins are used per resolution level; thus, the total feature dimension is $2M$. The CLBP_C feature extraction on the MRMD filtered domain includes four steps as follows:
Step 1–2: Steps 1 and 2 are the same as the RULBP procedure.
Step 3: Create the average image $I_{\mu, r}$ by computing, at each pixel position, the average of the absolute values of the filtered images over the directions at a resolution $r$. Then evaluate $\mu\left(I_{\mu, r}\right)$, the global average of the image $I_{\mu, r}$. Compare each value of $I_{\mu, r}$ with the global average $\mu\left(I_{\mu, r}\right)$. The CLBP_C on the MRMD filtered domain is expressed as follows:

$$CLBP\_C_{r}(p)=s\left(I_{\mu, r}(p)-\mu\left(I_{\mu, r}\right)\right).\qquad (6)$$
Step 4: Create and normalize the CLBP_C histogram, expressed as follows:

$$H_{C, r}(i)=\frac{1}{|P|} \sum_{p \in P} \delta\left(CLBP\_C_{r}(p), i\right),\qquad (7)$$

where $i \in\{0,1\}$, $|P|$ stands for the size of $P$, i.e., the size of the image, and $\delta$ stands for the Kronecker delta.
As an outcome, the extracted total feature dimension is $2M$.
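The CLBP_C steps reduce to a two-bin histogram per resolution level; a minimal sketch under the same assumptions (precomputed responses, our function name):

```python
import numpy as np

def clbp_c_histogram(y):
    """Normalized two-bin CLBP_C histogram for one resolution level.

    y : array of shape (N, H, W) of MRMD responses (assumed precomputed).
    """
    I_mu = np.abs(y).mean(axis=0)            # average image I_{mu,r}
    bit = (I_mu >= I_mu.mean()).astype(int)  # compare with global average
    hist = np.bincount(bit.ravel(), minlength=2).astype(float)
    return hist / bit.size

hc = clbp_c_histogram(np.random.default_rng(2).standard_normal((8, 16, 16)))
```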
3. Experiment and Results
The experiment is conducted in two groups on six databases: Corel [16] and VisTex [17]; Corel_MR and VisTex_MR with scale-variant images; and Corel_MD and VisTex_MD with rotation-variant images. The first group compares the methods using the partial features employed in the proposed method with the entire proposed method. The second group compares the CLBP method and the color autocorrelogram method using the R, G, and B components with the proposed method, which uses the H, S, and V components.
The similarity for the comparison is measured by the Mahalanobis distance [18], where each pair of corresponding components is normalized by its standard deviation. Retrieval performance is evaluated by precision and recall [19]. Precision is the percentage of relevant images among the retrieved images for a query image; recall is the percentage of relevant images retrieved out of all relevant images for a query image. In this experiment, the proposed method has 152 dimensions: CLBP_S (40), CLBP_M (40), CLBP_C (8), and color autocorrelogram (64), as shown in Table 1.
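The evaluation protocol can be stated compactly. This sketch uses our own function names; the distance is the diagonal (per-component standard-deviation) form of the Mahalanobis distance that the text describes.

```python
import numpy as np

def normalized_distance(q, x, sigma):
    """Distance between feature vectors q and x with each component
    normalized by its standard deviation sigma over the database -- the
    diagonal form of the Mahalanobis distance described in the text."""
    z = (np.asarray(q, float) - np.asarray(x, float)) / np.asarray(sigma, float)
    return float(np.sqrt((z ** 2).sum()))

def precision_recall(retrieved, relevant):
    """Precision and recall for one query: retrieved is the ranked list of
    returned image ids, relevant the set of ground-truth relevant ids."""
    hits = sum(1 for r in retrieved if r in relevant)
    return hits / len(retrieved), hits / len(relevant)

# 2 of 4 retrieved images are relevant; 5 images are relevant in total
p, rc = precision_recall(["a", "b", "c", "d"], {"a", "c", "e", "f", "g"})
```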
Table 1. Color spaces and dimensions of the retrieval methods used in the experiments
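The dimensions in Table 1 are consistent with $M = 4$ resolution levels and $N = 8$ directions; note that these values are inferred here from the stated sizes ($2M = 8$ and $M(N+2) = 40$), not given explicitly in the text.

```python
M, N = 4, 8                       # inferred: 2M = 8 and M(N + 2) = 40
clbp_s = M * (N + 2)              # CLBP_S dimension
clbp_m = M * (N + 2)              # CLBP_M dimension
clbp_c = 2 * M                    # CLBP_C dimension
autocorrelogram = 64              # H-S color autocorrelogram dimension
total = clbp_s + clbp_m + clbp_c + autocorrelogram
```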
Fig. 2 shows precision versus recall comparing the partial features of the proposed method with the full proposed method for the six databases. Fig. 3 shows precision versus recall comparing the methods using the R, G, and B components with the proposed method for the six databases.
Fig. 2. Precision versus recall comparing the separate methods employed in the proposed method with the proposed method for the six databases: (a) Corel, (b) VisTex, (c) Corel_MR, (d) VisTex_MR, (e) Corel_MD, and (f) VisTex_MD.
Fig. 3. Precision versus recall comparing the existing CLBP method and color autocorrelogram method using the R, G, and B components with the proposed method for the six databases: (a) Corel, (b) VisTex, (c) Corel_MR, (d) VisTex_MR, (e) Corel_MD, and (f) VisTex_MD.
The average gains of the proposed method over the methods of partial features are also investigated. In the first experiment, the average gains are 26.5% and 14.17% in Corel and 31.15% and 20.97% in VisTex; 24.75% and 12.56% in Corel_MR and 33.4% and 21.91% in VisTex_MR; 24.4% and 12.11% in Corel_MD and 35.45% and 23.13% in VisTex_MD, respectively.
In the second experiment, the average gains of the proposed method over the methods using R, G and B components are 22.96% and 12.88% in Corel and 9.3% and 6.96% in VisTex; 18.25% and 9.45% in Corel_MR and 11.01% and 7.51% in VisTex_MR; and 18.15% and 9.44% in Corel_MD and 15.16% and 10.42% in VisTex_MD, respectively. As a result, the proposed method is superior to the methods using partial features employed in the proposed method and the CLBP and color autocorrelogram methods using R, G, and B components.
Additionally, we compare the retrieval performance of the proposed method with that of the multichannel decoded LBP on the Corel-1K database under the same conditions as [12]. The proposed method achieves a precision of 78.3%, which is 3.37% higher than that of the latter (74.93%).
4. Conclusion
In this paper, a method combining CLBP based on MRMD filtering and the color autocorrelogram is proposed. The CLBP features are extracted in the MRMD filtered domain of the V component, and the color autocorrelogram features are extracted in the two dimensions of the H and S components. In the experiments, the proposed method is compared with the separate methods employed in it, the CLBP method and the color autocorrelogram method using the R, G, and B components, and the multichannel decoded LBP method. The proposed method outperforms these conventional methods. Our future research will include devising a scale-invariant feature extraction method efficient for various scale-variant images.
Acknowledgement
This study was supported by the BK21 Plus project (SW Human Resource Development Program for Supporting Smart Life) funded by the Ministry of Education, School of Computer Science and Engineering, Kyungpook National University, Korea (No. 21A20131600005).