1. Introduction
A gait recognition system usually includes motion detection, gait cycle detection, feature extraction and pattern recognition, of which feature extraction and pattern recognition are the most important operations. To improve recognition performance, researchers have therefore concentrated on these two operations and proposed various methods. Current gait recognition research relies on gait datasets such as the CASIA and USF gait datasets, which contain several gait sequences for each selected subject; a subject is a person whose gait data is sampled, and a gait sequence is a video of that subject walking. Gait feature extraction methods can be classified into model-based and model-free approaches [1]. In model-based approaches, gait is represented by a structural model that describes the shape of human body parts or by a motion model that describes the motion of each body part. Although model-based methods have many advantages, it is hard to extract the implicit model from gait sequences, so their performance is limited and their computational complexity is relatively high [2]. Model-free approaches [3] instead analyze the motion that subjects make during walking to extract gait features for recognition. Compared with model-based methods, model-free methods have demonstrated better performance with lower computational complexity on most gait databases.
The gait energy image (GEI), the average silhouette over one gait cycle, is widely used in recent model-free gait recognition algorithms because of its simplicity and effectiveness [4]. In the GEI approach, real and synthetic gait templates are generated to overcome the limited number of training templates and to improve recognition accuracy. Compared with other methods, the averaging operation makes GEI less sensitive to segmentation errors [4]. Since GEI is still significantly affected by silhouette quality, the method in [5] improved silhouette quality and recognition accuracy by using standard gait models as prior knowledge. Zhang et al. [6] proposed an active energy image (AEI) + two-dimensional locality preserving projection (2DLPP) method, which first extracts the active regions by calculating the difference between two adjacent silhouette images and then constructs an AEI by accumulating these active regions. The AEI+GEI method [7] exploits the complementarity of dynamic and static information and combines them for gait recognition.
In gait recognition, linear methods do not perform well under nonlinear factors such as illumination, view and walking speed, so better solutions can be achieved with nonlinear methods such as kernel-based methods. Principal component analysis (PCA) is a classical method for dimensionality reduction and feature extraction. Kernel PCA (KPCA) is a nonlinear generalization of PCA, in which the kernel trick is first employed to project the input image data into a high-dimensional feature space F, and standard PCA is then performed in F [8]. Yang and Qiu [9] integrated KPCA and support vector machine (SVM) methods to improve the classification of gait patterns and increase the recognition rate, and Fazli et al. [10] used linear discriminant analysis (LDA) for feature reduction and SVM classification as an optimal discriminant method. Inspired by sparse representation (SR), Qiao et al. [11] proposed a sparsity preserving projection (SPP) method for face recognition, and Wang et al. [12] proposed a kernel sparsity preserving projection (KSPP) method for gait recognition. A spatio-temporal gait recognition method based on the Radon transform was proposed in [13], in which the gait contour images are decomposed into two temporal templates that are then subjected to the Radon transform for feature extraction. For matrix-based subspace analysis, two-dimensional PCA (2DPCA) was proposed in [14]; it can achieve better recognition performance than PCA when the number of samples is small.
Because Gabor features are robust against local distortions, Gabor wavelets have been successfully applied to gait recognition. Huang et al. [15] proposed a method for recognizing human identity from gait features based on Gabor wavelets and modified gait energy images, and Abdullah and El-Alfy [16] proposed a statistical gait recognition approach based on the analysis of overlapping Gabor-based regions. To overcome the high feature dimension of the traditional Gabor feature, Shao and Hong [17] proposed a gait recognition method based on an integrated Gabor feature, in which the active-region Gabor feature images are integrated in a multi-scale and multi-angle way by means of mean fusion and differential binary encoding. A method based on Gabor wavelets and (2D)²PCA (two-directional 2DPCA) was also proposed in [18] to reduce the high feature dimension.
In this paper, we present a novel gait recognition algorithm that fuses the features of the GEI dynamic region with Gabor features. Gait features are first extracted from the GEI using Gabor wavelets. An improved KPCA method is then adopted to reduce the dimension of the feature matrix. Finally, an SVM is employed to identify the gait sequences. Experimental results show that the proposed algorithm greatly reduces the feature matrix dimension and improves the recognition accuracy.
The rest of the paper is organized as follows. Section 2 briefly describes image preprocessing and cycle detection. Section 3 describes the proposed method based on the fusion of GEI dynamic region features and Gabor features. Section 4 discusses the experimental methods and results. Finally, Section 5 presents conclusions and future work.
2. Preprocessing and Cycle Detection
The quality of image preprocessing directly affects gait feature extraction. Because of lighting, occlusion and other external factors, gait images may suffer from loss of information, shadows, and improperly chosen binarization thresholds. It is therefore necessary to preprocess the gait images. This paper follows the steps below to extract the body silhouette.
Step 1 (Background reconstruction): Since the scene is almost static throughout the video sequence and the background corresponds to low-frequency information, the per-pixel average over the image sequence can be used to estimate the static background.
Step 2 (Moving object detection): Background subtraction is used to detect the moving object in the image sequence, and the max-entropy threshold method is used for image binarization. After the background is modeled, the modeled background frame is subtracted from the current gait frame, and the difference is compared with the threshold. Pixels whose difference is greater than the threshold are taken as body silhouette, and pixels whose difference is less than the threshold as background. Binarizing the difference image in this way yields the initial gait silhouette.
Step 3 (Morphology post-processing): After binarization, the erosion and dilation operators of mathematical morphology are first used to remove noise and fill small holes in the image, and connected-component labeling and analysis is then used to separate out the complete object.
Step 4 (Normalization of binary image): To reduce computational complexity, remove redundant information and eliminate the inconsistency in silhouette size caused by changes in camera focal length, the silhouette images are normalized by scaling them to a uniform size.
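The steps above can be illustrated with a minimal NumPy sketch. It is not the implementation used in the paper: the threshold is passed in as a fixed value rather than chosen by the max-entropy method, the morphological post-processing of Step 3 is omitted, and the function names, the 64×64 output size and the synthetic frames are assumptions made only for illustration.

```python
import numpy as np

def estimate_background(frames):
    """Step 1: per-pixel mean over the sequence as the static background."""
    return np.mean(np.asarray(frames, dtype=np.float64), axis=0)

def extract_silhouette(frame, background, threshold):
    """Step 2: background subtraction followed by binarization.

    The threshold is supplied by the caller (assumption); the paper
    selects it with the max-entropy method.
    """
    diff = np.abs(np.asarray(frame, dtype=np.float64) - background)
    return (diff > threshold).astype(np.uint8)

def normalize_silhouette(mask, out_h=64, out_w=64):
    """Step 4: crop to the bounding box and rescale to a fixed size
    with nearest-neighbour sampling."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return np.zeros((out_h, out_w), dtype=np.uint8)
    crop = mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    rows = np.arange(out_h) * crop.shape[0] // out_h
    cols = np.arange(out_w) * crop.shape[1] // out_w
    return crop[np.ix_(rows, cols)]

# Synthetic demo: a bright 10x3 block moving over a uniform background.
frames = np.full((10, 40, 40), 50.0)
for t in range(10):
    frames[t, 15:25, 2 + 3 * t:5 + 3 * t] = 200.0

background = estimate_background(frames)
mask = extract_silhouette(frames[0], background, threshold=60.0)
normalized = normalize_silhouette(mask)
```

Because the mean background is slightly contaminated by the moving object, the threshold has to be large enough to reject that residue, which is one reason an adaptive threshold is preferable in practice.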
Gait is periodic, so the width and height of the silhouette vary periodically over a gait sequence. Several existing approaches can detect the gait cycle accurately. Our algorithm uses the approach in [19], which first counts the number of foreground pixels in the lower half of each silhouette in the sequence and smooths the resulting signal with a median filter. The number of frames in one gait cycle is then calculated from the filtered signal.
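A rough sketch of this idea follows. The text does not state how the cycle length is read from the filtered signal, so the autocorrelation peak used here is an assumption, as are the function names and the window size. How the returned period relates to a full gait cycle (for example, one step versus two) depends on the view and is not addressed here.

```python
import numpy as np

def lower_half_counts(silhouettes):
    """Count foreground pixels in the lower half of each binary silhouette."""
    sil = np.asarray(silhouettes)
    h = sil.shape[1]
    return sil[:, h // 2:, :].reshape(sil.shape[0], -1).sum(axis=1).astype(np.float64)

def median_filter_1d(signal, window=3):
    """Sliding median with edge padding; removes isolated spikes."""
    signal = np.asarray(signal, dtype=np.float64)
    pad = window // 2
    padded = np.pad(signal, pad, mode="edge")
    return np.array([np.median(padded[i:i + window]) for i in range(len(signal))])

def dominant_period(signal):
    """Lag of the first local maximum of the autocorrelation (assumed estimator)."""
    x = np.asarray(signal, dtype=np.float64)
    x = x - x.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    for lag in range(1, len(ac) - 1):
        if ac[lag] > ac[lag - 1] and ac[lag] >= ac[lag + 1]:
            return lag
    return None

# Synthetic pixel-count signal with one corrupted frame.
t = np.arange(60)
counts = 100 + 20 * np.cos(2 * np.pi * t / 12)
counts[7] += 500  # simulated segmentation failure
smoothed = median_filter_1d(counts)
period = dominant_period(smoothed)
```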
3. The Proposed Algorithm
3.1 Features of the Body Dynamic Parts in GEI
GEI is the cumulative energy image of the normalized silhouettes over a complete gait cycle. The luminance of each pixel reflects how frequently the body occupies that pixel over the cycle. Given a sequence of gait silhouette images, GEI is defined as follows:
$$\mathrm{GEI}(x, y) = \frac{1}{N} \sum_{t=1}^{N} B_t(x, y)$$
where B_t(x, y) is the binary silhouette image at frame t, N is the number of frames in a gait cycle, t is the frame index, and x and y are the two-dimensional image coordinates. Fig. 1 shows the GEI at the 90° view within a gait cycle.
Fig. 1. GEI under three conditions at the 90° view: (a) normal, (b) bag, and (c) clothing.
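As a small illustration of the averaging that defines the GEI, the following sketch computes it for a stack of aligned binary silhouettes. The function name and the toy input are assumptions for illustration.

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Average N aligned binary silhouettes (shape N x H x W) over one gait cycle."""
    sil = np.asarray(silhouettes, dtype=np.float64)
    return sil.mean(axis=0)

# Toy cycle of four 2x2 silhouettes.
cycle = np.array([
    [[1, 0], [1, 1]],
    [[1, 0], [0, 1]],
    [[1, 0], [1, 1]],
    [[1, 0], [0, 1]],
])
gei = gait_energy_image(cycle)
```

Pixels that are foreground in every frame get the value 1, pixels that are never foreground get 0, and moving parts fall in between.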
Fig. 1 shows that the outer contour of the human body changes when the subject carries a backpack or other objects. Since the upper body moves very little during walking, whereas the lower body (such as the legs) moves markedly under the normal, bag and clothing conditions alike, the paper extracts the dynamic part of the body as the gait feature. First, according to the anatomical model in which the height of the pelvis equals 48% of the total body height, we separate the leg area from the GEI. Then, according to the spacing between the feet, the leg area is further cropped to obtain the dynamic region of the body. The selected dynamic region in this paper is about 48×39 pixels. Fig. 2 shows that extracting the dynamic region effectively eliminates the impact of the bag and clothing conditions on gait recognition.
Fig. 2. Extraction of the GEI dynamic region.
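The cropping can be sketched as follows, assuming the GEI is height-normalized so that the body spans the full image height. The horizontal crop to the columns occupied by the legs is an assumed reading of "according to the spacing between the feet"; the function name and the toy GEI are hypothetical.

```python
import numpy as np

def dynamic_region(gei, leg_ratio=0.48):
    """Keep the bottom `leg_ratio` of the rows (pelvis height = 48% of body
    height), then crop to the columns that contain any leg energy."""
    h = gei.shape[0]
    top = h - int(round(leg_ratio * h))
    legs = gei[top:, :]
    cols = np.nonzero(legs.sum(axis=0) > 0)[0]
    if cols.size == 0:
        return legs
    return legs[:, cols.min():cols.max() + 1]

# Toy 100x60 GEI whose energy occupies columns 10..48.
toy_gei = np.zeros((100, 60))
toy_gei[:, 10:49] = 0.5
region = dynamic_region(toy_gei)
```

With a 100-pixel-high GEI this yields a 48×39 region, the size quoted in the text.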
3.2 Improved Orientation Feature Extraction Based on Gabor Wavelets
The characteristics of Gabor wavelets (filters), especially their frequency and orientation representations, are similar to those of the human visual system and are particularly appropriate for perceptual representation and discrimination. Gabor filter-based features, extracted directly from gray-level images, have been successfully and widely applied to gait and fingerprint recognition. However, the dimension of the Gabor feature vector is very high when multiple scales and orientations are adopted. For example, if the image size is 64×64 and 3 scales and 8 orientations are selected, the dimension of the Gabor feature vector reaches 98,304 (64×64×3×8). Computing with feature vectors of such high dimension is expensive [20]. Therefore, this paper proposes an improved scheme for extracting features from gait images. The improved method uses the 2D Gabor filter of [21], which is defined as:
where f in (2) and θ in (3) represent the frequency and orientation of the filter, respectively, and σx and σy are the constants of the Gaussian envelope. Orientation information of the image can be obtained by varying the parameter θ. The Gabor wavelet feature of a gait image I is:
$$\mathrm{Gabor}(z) = I(z) * g(z)$$
where * denotes convolution, I(z) is the gray value at z = (x, y) in the gait image, g(z) is the Gabor filter coefficient for the chosen angle parameter, and Gabor(z) is the Gabor feature of the gait image after filtering. The amplitude G of the image at z = (x, y) is:
$$G(z) = \sqrt{\mathrm{Re}\big(\mathrm{Gabor}(z)\big)^2 + \mathrm{Im}\big(\mathrm{Gabor}(z)\big)^2}$$
Our experimental database is the publicly available CASIA gait database provided by the Institute of Automation, Chinese Academy of Sciences [22], a large multi-view gait dataset containing 124 subjects. Extracting more feature details at multiple view angles is therefore crucial for recognition. A Gabor filter extracts orientation features well, and an appropriate choice of orientation parameters gives these features good discriminative ability. Since the same gait features differ only subtly across view angles, the paper extracts only the orientation features of the GEI dynamic region and does not need to construct a Gabor filter bank with 3 scales and 8 orientations. Instead, the paper uses a Gabor filter with one scale and multiple orientations and selects the angle parameters according to the angle between the pedestrian and the camera. The filtered image Gabor(z), expanded column by column, serves as the orientation feature vector in feature selection, that is, as the Gabor wavelet feature of the GEI dynamic region.
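The one-scale, multi-orientation extraction can be sketched as below. The kernel is a commonly used complex Gabor form (Gaussian envelope times a complex sinusoid along the rotated x axis) and is an assumption: it may differ from the filter of [21] in normalization or parameterization. The parameter values, kernel size, orientation set and function names are likewise illustrative only.

```python
import numpy as np

def gabor_kernel(f, theta, sigma_x, sigma_y, size=15):
    """Complex 2D Gabor kernel at frequency f and orientation theta (assumed form)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-0.5 * (xr ** 2 / sigma_x ** 2 + yr ** 2 / sigma_y ** 2))
    return envelope * np.exp(2j * np.pi * f * xr)

def convolve_same(image, kernel):
    """Linear 2D convolution via FFT, cropped to the size of `image`."""
    H, W = image.shape
    kh, kw = kernel.shape
    shape = (H + kh - 1, W + kw - 1)
    full = np.fft.ifft2(np.fft.fft2(image, shape) * np.fft.fft2(kernel, shape))
    r0, c0 = (kh - 1) // 2, (kw - 1) // 2
    return full[r0:r0 + H, c0:c0 + W]

def gabor_orientation_features(region, thetas, f=0.1, sigma_x=4.0, sigma_y=4.0):
    """One scale, several orientations: amplitude maps expanded column by column."""
    feats = []
    for theta in thetas:
        response = convolve_same(region, gabor_kernel(f, theta, sigma_x, sigma_y))
        amplitude = np.abs(response)  # sqrt(Re^2 + Im^2)
        feats.append(amplitude.flatten(order="F"))
    return np.concatenate(feats)

rng = np.random.default_rng(0)
region = rng.random((48, 39))            # stand-in for a GEI dynamic region
thetas = [0.0, np.pi / 4, np.pi / 2]     # hypothetical orientation set
features = gabor_orientation_features(region, thetas)
```

For a 48×39 region and three orientations the vector has 48×39×3 = 5,616 entries, compared with 98,304 for a 64×64 image filtered at 3 scales and 8 orientations.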
3.3 Feature Fusion by Averaging Method
The features selected for fusion are intended to make better use of the local information captured by Gabor wavelets, such as spatial frequency (scale), spatial location and orientation, using one scale and multiple orientations. Compared with the traditional approach of multiple scales and multiple orientations, our method greatly reduces the feature dimension, data redundancy and computational complexity. The first feature selected for fusion is the GEI dynamic region, which clearly represents the frequency and speed of movement of the various body parts; the second is the Gabor feature with multiple orientations, which captures gait features at different views.
Tax et al. [23] concluded that combining by averaging is superior to combining by multiplying. To obtain a better recognition rate, the paper adopts the averaging method to fuse the features of the GEI dynamic region with the Gabor wavelet features at the feature level. The proposed algorithm first obtains the gait silhouette images from the gait sequence, calculates the GEI and separates the dynamic region. It then expands the dynamic region column by column into a one-dimensional vector as feature 1, and expands the Gabor wavelet features of the dynamic region column by column into a one-dimensional vector as feature 2. Finally, according to the averaging rule, the weighted sum of feature 1 and feature 2 is taken as the fused feature, which has the advantages of both dynamic and static features and compensates for the deficiencies of a single feature.
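A minimal sketch of feature-level fusion by weighted averaging is given below. The text does not specify the weights, any normalization, or how the two vectors are brought to the same length, so the equal weights, the min-max scaling and the equal-length requirement are all assumptions, as are the function names.

```python
import numpy as np

def min_max(v):
    """Scale a vector to [0, 1] so both features contribute on a comparable range."""
    v = np.asarray(v, dtype=np.float64)
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.zeros_like(v)

def fuse_features(feature1, feature2, w1=0.5, w2=0.5):
    """Weighted sum of two equal-length feature vectors (averaging rule for w1 = w2 = 0.5)."""
    f1, f2 = min_max(feature1), min_max(feature2)
    if f1.shape != f2.shape:
        raise ValueError("features must have the same length before fusion")
    return w1 * f1 + w2 * f2

fused = fuse_features([0.0, 1.0, 2.0], [10.0, 10.0, 30.0])
```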
3.4 Dimension Reduction Using the Improved KPCA
Dimension reduction is an important step for reducing the time complexity of the overall framework. To improve the feature extraction capability of KPCA, the paper designs a new kernel function K(x, y) = mK1(x, y) + nK2(x, y), which combines a Gaussian kernel function K1(x, y) and a polynomial kernel function K2(x, y), where m and n are the weights of the individual kernels in the fused kernel. Gait recognition is a complicated process in which both whole-body silhouette features and local gait features have to be taken into account. Therefore, the paper combines the Gaussian kernel, which has good local learning ability, with the polynomial kernel, which has better global generalization ability. The main steps of the improved KPCA algorithm are described in Algorithm 1.
Algorithm 1. Improved KPCA algorithm
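The body of Algorithm 1 is not reproduced in the text, so the following is only a sketch of standard KPCA applied to the mixed kernel K = mK1 + nK2. The kernel parameters (m, n, sigma, degree, coef0), the function names and the random input are assumptions for illustration.

```python
import numpy as np

def mixed_kernel(X, Y, m=0.7, n=0.3, sigma=1.0, degree=2, coef0=1.0):
    """K = m * Gaussian + n * polynomial, evaluated between rows of X and Y."""
    sq = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    k_gauss = np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))
    k_poly = (X @ Y.T + coef0) ** degree
    return m * k_gauss + n * k_poly

def kpca_fit_transform(X, n_components, **kernel_params):
    """Standard KPCA: centre the Gram matrix, eigendecompose, project the training data."""
    K = mixed_kernel(X, X, **kernel_params)
    N = K.shape[0]
    one = np.full((N, N), 1.0 / N)
    Kc = K - one @ K - K @ one + one @ K @ one      # centring in feature space
    vals, vecs = np.linalg.eigh(Kc)                 # ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_components]   # keep the largest ones
    vals, vecs = vals[order], vecs[:, order]
    alphas = vecs / np.sqrt(np.maximum(vals, 1e-12))
    return Kc @ alphas

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))   # 20 samples with 50-dimensional fused features
Z = kpca_fit_transform(X, n_components=5, sigma=5.0)
```

Since a non-negative weighted sum of positive semidefinite kernels is itself positive semidefinite, the mixed kernel is a valid kernel whenever m and n are non-negative.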
3.5 Gait Recognition Using SVM
SVM is a powerful machine learning technique based on statistical learning theory. The traditional SVM is a two-class classifier. There are two approaches to solving the n-class problem with SVMs: the one-against-one approach and the one-against-rest approach [24]. Since the dimension of the selected feature subset is relatively small, the paper adopts the one-against-one approach. The steps to construct the gait classifier are as follows:
Fig. 3. SVM three-class classification diagram.
Fig. 4. Some examples from CASIA dataset A (a), dataset B (b) and dataset C (c).
Step 1: Suppose there are m classes of human gait to be classified, labeled S1, S2, …, Sm. During training, an SVM classifier is constructed between every pair of gait classes, so m classes require m(m–1)/2 SVM classifiers fi (i = 1, 2, …, m(m–1)/2).
Step 2: During testing, an unknown gait sample is passed to every pairwise classifier, each of which casts one vote for one of its two classes Sj (j = 1, 2, …, m); the class that receives the most votes is taken as the class of the unknown gait.
Fig. 3 illustrates the three-class SVM. “1 and 2” denotes the SVM constructed from the class 1 and class 2 samples, and the 1 and 2 on its left and right sides are its two possible classification results. When the outputs of the “1 and 2” and “1 and 3” classifiers are both 1, the sample is classified as class 1. When the results of the three SVM classifiers differ, each classifier's output adds one vote to the corresponding class and the votes are counted at the end. In Fig. 3, class 1 receives the most votes, so the gait sample is classified as class 1.
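The voting scheme can be sketched as follows. To keep the example self-contained, each pairwise decision is made by a nearest-class-mean rule, which is only a stand-in for a trained binary SVM; the function names and toy data are hypothetical.

```python
from itertools import combinations
import numpy as np

def train_pairwise(X, y):
    """Build one binary model per pair of classes: m(m-1)/2 models in total.

    Each model stores the two class means (stand-in for a binary SVM).
    """
    classes = np.unique(y).tolist()
    means = {c: X[y == c].mean(axis=0) for c in classes}
    models = {(a, b): (means[a], means[b]) for a, b in combinations(classes, 2)}
    return classes, models

def predict_by_voting(x, classes, models):
    """Every pairwise model casts one vote; the class with the most votes wins."""
    votes = {c: 0 for c in classes}
    for (a, b), (mu_a, mu_b) in models.items():
        winner = a if np.linalg.norm(x - mu_a) <= np.linalg.norm(x - mu_b) else b
        votes[winner] += 1
    return max(votes, key=votes.get), votes

# Three toy gait classes in a 2D feature space.
X = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 0.0], [5.2, 0.0], [0.0, 5.0], [0.0, 5.2]])
y = np.array([1, 1, 2, 2, 3, 3])
classes, models = train_pairwise(X, y)
label, votes = predict_by_voting(np.array([0.2, 0.1]), classes, models)
```

With m = 3 classes this builds 3 pairwise models, matching m(m–1)/2.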
4. Experiments and Analysis
In this section, the effectiveness of the proposed algorithm is evaluated experimentally. The experiments are carried out on the publicly available CASIA gait database [22] (Fig. 4). In CASIA, there are 10 sequences for each subject: 6 for normal walking (normal), 2 for walking with a bag (bag) and 2 for walking in a coat (clothing). The experiments use LIBSVM, a simple, fast and efficient SVM pattern recognition package. We set the LIBSVM SVM type to C_SVC and the kernel function to the radial basis function. The recognition rate of the algorithm is evaluated by cross-validation. To compare the performance of different gait recognition algorithms, the correct classification rate (CCR) is used as the evaluation index [25].
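For reference, the CCR can be computed as the percentage of test samples whose predicted identity matches the true one. The sketch below assumes this usual definition; the function name and the toy labels are illustrative.

```python
import numpy as np

def correct_classification_rate(y_true, y_pred):
    """CCR in percent: correctly classified samples divided by all samples."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("label arrays must have the same shape")
    return 100.0 * float(np.mean(y_true == y_pred))

ccr = correct_classification_rate([1, 2, 3, 3], [1, 2, 3, 1])
```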
In the first experiment, the algorithm is tested on the sequences of 124 subjects from 11 views. For each subject and each view angle, we randomly select three of the normal gait sequences as the training set and use the remaining three as the testing set. The results are shown in Table 1. In Table 2, we use the same gait database of 124 subjects at 90° and compare the recognition rates of single-feature methods with those of feature fusion methods by cross-validation.
Table 1. Recognition rate of 9 different algorithms
Table 2. Recognition rate of single and fusion feature algorithms
The second experiment validates the robustness of the algorithm under the clothing and bag conditions. The training set consists of the first three normal gait sequences, the first bag sequence and the first coat sequence at 90°, and the testing set consists of the remaining normal, clothing and bag sequences. Table 3 shows the recognition rates of 6 other algorithms and our proposed algorithm.
Table 3. Comparison of different algorithms under three conditions
In the first experiment, the average recognition rates of eight other algorithms are compared with that of our proposed algorithm at different view angles under normal conditions. As seen from Table 1, the mean recognition rates of KPCA and PCA are similarly poor. KSPP improves the neighborhood relationship among the three kinds of gait, and SPP preserves the local information of the original data well during dimension reduction while preserving the maximum sparsity in the coefficient matrix; as a result, the KSPP and PCA+SPP algorithms improve on the poor recognition rates of KPCA and PCA at some angles (such as 18°, 36°, and 144°). When AEI is selected as the gait feature, the average recognition rate of AEI+2DLPP is inferior to that of our proposed algorithm, which indicates that extracting the dynamic region of GEI as the gait feature is effective. The result for the fusion of AEI and GEI in Table 1 is greatly superior to that of AEI alone, which indicates that feature fusion can improve gait recognition performance. Furthermore, Table 1 shows that our proposed algorithm achieves a recognition rate above 90% at 10 of the 11 view angles, which indicates that the method based on fusing GEI dynamic region features and Gabor features distinguishes different human gaits better and achieves better recognition than the other algorithms on the small-sample gait database. As seen from Table 2, the recognition rate of the feature fusion algorithms is about 10% higher than that of the single-feature methods, and the proposed algorithm is the best of all the feature fusion algorithms. In addition, Tables 1 and 2 show that the average recognition rates with the SVM classifier are 94.35% and 93.29%, respectively, which illustrates the validity of SVM for classifying gait sequences.
The second experiment shows that the average recognition rate of the proposed algorithm is better than that of the other methods even under the clothing and bag conditions, which indicates that the proposed algorithm reduces the influence of clothing and carried bags and is robust. In addition, although the classification rates of PCA+SPP, KSPP and the proposed method are very similar, the former two methods use linear programming in the sparse reconstruction process, so their training time is much longer than that of the proposed method.
5. Conclusion
In this paper, a novel gait recognition algorithm based on Gabor wavelets and an improved KPCA is proposed. Compared with existing gait recognition methods, the proposed algorithm is shown to have lower complexity, shorter training time, better robustness and higher classification rates. The GEI preserves important information such as walking frequency, contour and phase, while the Gabor features capture the key gait features at different views. Meanwhile, the improved KPCA algorithm significantly reduces the dimension of the feature space, which saves training time, and the one-against-one SVM classifies gait sequences efficiently. Experimental results show that the proposed method achieves higher recognition accuracy with less computational time than eight other existing approaches. The robustness of the proposed algorithm is demonstrated by testing at various viewing angles and rank numbers.
One limitation of the proposed algorithm is that the improved KPCA fuses a Gaussian kernel and a polynomial kernel, which may make the gait recognition rate inferior to that of common KPCA methods at some view angles. In future work, we will study more robust kernel functions for different view angles.