1. Introduction
The number of urban motor vehicles has increased dramatically with the rapid development of the economy, causing a steady deterioration in traffic conditions. A road traffic monitoring system can not only accurately capture traffic condition information, such as traffic flow, vehicle speed, and road occupancy rate, but also recognize the behaviors of vehicles on the road, such as driving in the wrong direction, running red lights, and traffic accidents. Among these functions, traffic flow detection is the key point in this field.
According to the operating mode and the wavelength range of the electromagnetic waves used, traffic flow detection systems are divided into three types: magnetic traffic flow detection systems [1], ultrasonic traffic flow detection systems [2], and video traffic flow detection systems [3]. The magnetic system detects vehicles through the change they induce in the magnetic field of a toroidal coil buried under the road surface, from which the detector calculates parameters such as flow rate and speed. This technology is mature, easy to master, and low in cost; however, it is easily damaged, has a short service life, and incurs high maintenance costs. The ultrasonic system works on the principle of ultrasonic ranging. Ultrasonic detectors have a very short service life in harsh environments such as dusty traffic intersections and are susceptible to false alarms caused by wind-induced vibration. Compared with the other two systems, the video traffic flow detection system offers lower cost and easier installation and inspection [4], so it has become the main trend. The video traffic flow detection system is a comprehensive system that integrates image processing and information management. After video signals are acquired by cameras, they are converted to digital images by a special image capture card [5], which integrates the functions of traffic video capture, vehicle tracking, and vehicle type recognition. The computer then processes the digital images in real time to extract the traffic parameters, which are communicated to the traffic control center by radio communication or computer network. The system flowchart is shown in Fig. 1.
Fig. 1. Flow chart of the video traffic flow detection system.
Fig. 2. Block diagram of the proposed method.
The traffic flow detection system is an important part of the road traffic monitoring system. It is used to measure and collect traffic parameters of highways and urban roads, such as vehicle flow rate, type, speed, and road occupancy. These traffic parameters are effective evidence for controlling traffic access and key factors for measuring traffic quality [6,7]. Recently, research on traffic parameter detection systems has made considerable progress, and various detection systems have emerged.
The video traffic flow detection system, which integrates the functions of traffic video capture, vehicle tracking, and vehicle type recognition, is the terminal node of the intelligent transportation system (ITS) [8]; the realization of urban intelligent traffic management requires the large-scale deployment of such systems. At the same time, the system needs to be stable and cost-effective enough to adapt to harsh outdoor environments and to operate uninterruptedly. Considering efficiency and cost, a video traffic flow detection method based on machine vision is proposed. First, a three-frame difference method is used to estimate motion and establish the initial background. Then, the background image is updated by a statistical scoring strategy. At the same time, a simple and effective shadow elimination method is introduced to improve the accuracy of vehicle segmentation. In addition, a vehicle matching and tracking strategy is proposed based on vehicle location, color, and fractal dimension information. The flow of the whole method is shown in Fig. 2.
2. The Improved Background Difference Method
In recent years, researchers have developed many methods for traffic flow detection, which can be categorized into the following four types: the HOG feature method [9], the cascade classifier method [10], deep learning frameworks [11], and the image difference method [12,13]. Compared with the fourth type, the former three offer high precision and accurate localization and require no background construction. However, they are complicated and difficult to run in real time on a common hardware platform at present. That is why the image difference method is widely used in traffic flow detection systems.
In practical applications, the image difference method can be grouped into two types: one is based on the frame difference between two adjacent frames; the other is based on the difference between the current frame and a constructed background frame. The first type produces blank holes and cannot detect temporarily stopped objects, which affects the traffic flow detection accuracy [14]. The second type relies on the assumptions that the acquisition device is fixed and that the background changes little over a short time. It can overcome some problems of the adjacent frame difference method, but it needs to construct the background in real time [15].
Based on the above analysis and the characteristics of traffic flow detection, the image background difference method is adopted as the basis of the detection method. Since the background quality has a large effect on system performance, we develop an improved image background difference method, including background generation and background update.
2.1 Background Image Generation
By analyzing the grayscale features of the background and foreground in traffic scenes, we propose a motion estimation method that uses the subtraction of three adjacent frames to construct the initial background.
Suppose that [TeX:] $$\left\{I_{i, j}^{t}\right\}$$ denotes the t-th incoming grayscale frame, whose height is H and width is W, where [TeX:] $$I_{i, j}^{t}$$ is the gray value of the pixel in row [TeX:] $$i(0 \leq i \leq H-1)$$ and column [TeX:] $$j(0 \leq j \leq W-1).$$ Given three continuous frames No. t-1, t, and t+1, the background image and foreground object are [TeX:] $$\left\{B_{i, j}^{t}\right\} \text { and }\left\{O_{i, j}^{t}\right\}.$$ The processing steps are as follows:
1) The three continuous frames are divided into two groups: one is the No. t-1 frame and the No. t frame; the other is the No. t frame and the No. t+1 frame. The inter-frame gray difference operation is then carried out pixel by pixel for each group, and the absolute values are saved in [TeX:] $$\left\{D_{i, j}^{-1}\right\} \text { and }\left\{D_{i, j}^{+1}\right\}:$$
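In symbols, the two difference images are

[TeX:] $$D_{i, j}^{-1}=\left|I_{i, j}^{t}-I_{i, j}^{t-1}\right|, \quad D_{i, j}^{+1}=\left|I_{i, j}^{t+1}-I_{i, j}^{t}\right|$$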
2) Each value in [TeX:] $$\left\{D_{i, j}^{-1}\right\} \text { and }\left\{D_{i, j}^{+1}\right\}$$ is compared with a pre-set threshold [TeX:] $$T_{0}.$$ For a point (i, j), if the corresponding values in both [TeX:] $$\left\{D_{i, j}^{-1}\right\} \text { and }\left\{D_{i, j}^{+1}\right\}$$ are greater than [TeX:] $$T_{0},$$ the point is judged to be moving within the current three continuous frames and is assigned to the foreground image [TeX:] $$\left\{O_{i, j}^{t}\right\}:$$
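In symbols, this judgment reads

[TeX:] $$O_{i, j}^{t}=\begin{cases}255, & \left(D_{i, j}^{-1}>T_{0}\right) \text { AND }\left(D_{i, j}^{+1}>T_{0}\right) \\ 0, & \text {otherwise}\end{cases}$$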
AND is defined as logical “AND”.
3) Excluding the foreground image [TeX:] $$\left\{O_{i, j}^{t}\right\},$$ the remaining part of the No. t frame is the background image [TeX:] $$\left\{B_{i, j}^{t}\right\},$$ obtained by clearing all pixels with [TeX:] $$O_{i, j}^{t}=255$$ from the input image of the No. t frame:
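In symbols (with pixels masked by the foreground left unfilled until a later frame provides a static value),

[TeX:] $$B_{i, j}^{t}=I_{i, j}^{t} \quad \text { for all }(i, j) \text { such that } O_{i, j}^{t}=0$$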
4) Repeat the above operations for the initial N frames and accumulate the background statistics pixel by pixel; a complete initial background image can then be obtained.
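The statistic used to merge the per-frame background samples is not spelled out above; the following NumPy sketch assumes a per-pixel running mean over the frames in which the pixel is judged static, with illustrative values for the threshold T0 and the frame count N:

```python
import numpy as np

def build_initial_background(frames, t0=20, n=100):
    """Sketch of the three-frame-difference initial background builder.

    frames: sequence of H x W uint8 grayscale images (at least n+2 frames).
    t0:     motion threshold T0 (illustrative; the text does not fix a value).
    n:      number of frames N (scene dependent: ~100 for highways,
            ~150 for city scenes, >200 for dense traffic; see text).
    """
    frames = [f.astype(np.int16) for f in frames[:n + 2]]
    h, w = frames[0].shape
    acc = np.zeros((h, w), dtype=np.float64)   # sum of static samples
    cnt = np.zeros((h, w), dtype=np.int32)     # number of samples per pixel

    for t in range(1, len(frames) - 1):
        d_prev = np.abs(frames[t] - frames[t - 1])   # D^{-1}
        d_next = np.abs(frames[t + 1] - frames[t])   # D^{+1}
        moving = (d_prev > t0) & (d_next > t0)       # foreground mask O^t
        still = ~moving                              # background pixels of frame t
        acc[still] += frames[t][still]
        cnt[still] += 1

    # Average the collected samples; pixels never seen as static stay 0.
    background = np.where(cnt > 0, acc / np.maximum(cnt, 1), 0)
    return background.astype(np.uint8)
```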
Fig. 3 shows initial background images built for a common city traffic scene with different values of N.
Fig. 3. The initial background images: (a) N=30, (b) N=80, and (c) N=200.
Obviously, the quality of the initial background image largely depends on the choice of N. If N is too small, the background may be incomplete and the detection of traffic flow parameters will be affected. On the other hand, if N is too large, building the background takes a long time and makes the system inefficient. From the experimental data, the regular pattern of the N value according to the traffic scene can be summarized. For ordinary highways, where vehicles are relatively sparse, N=100 makes the background images meet the testing requirements. For ordinary city-traffic scenes, N=150 is more appropriate. When vehicles are extremely dense, such as at national highway toll stations and the intersections of urban arterial roads, N usually needs to be more than 200 for the initial background images to satisfy the requirements.
2.2 Background Image Updating
In order to update the background under different weather and lighting conditions and to avoid misjudgments in vehicle extraction and tracking, a statistical scoring method is proposed to update the background in real time.
Suppose that [TeX:] $$\left\{S O_{i j}\right\} \text { and }\left\{S N_{i j}\right\}$$ are two background updating score counters for the pixel (i, j): [TeX:] $$\left\{S O_{i j}\right\}$$ records the score of the pixel in the background image, while [TeX:] $$\left\{S N_{i j}\right\}$$ records the score of the pixel in the current input image. The score depends on how often a pixel value occurs. When the score recorded by the current-image counter exceeds the background updating threshold, the pixel's gray value replaces the corresponding old gray value in the background; otherwise, the original background remains unchanged. The new background updating strategy can be described as:
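In symbols, the replacement rule described above is

[TeX:] $$B_{i, j}^{t+1}=\begin{cases}I_{i, j}^{t}, & S N_{i j}>T_{1} \\ B_{i, j}^{t}, & \text {otherwise}\end{cases}$$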
where [TeX:] $$T_{1}$$ is the background updating threshold.
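The precise bookkeeping of the two scoring counters is not detailed in the text. The following sketch is one plausible realization, assuming a gray-level match tolerance for deciding whether a pixel still agrees with the background and a score swap when a replacement occurs (both are assumptions):

```python
import numpy as np

def update_background(background, frame, so, sn, t1=30, match_tol=10):
    """Sketch of the statistical-scoring background update.

    so, sn:    int32 score arrays SO and SN (same shape as the images).
    t1:        background updating threshold T1 (illustrative value).
    match_tol: gray-level tolerance for deciding whether a pixel still
               matches the background (an assumption; the text only says
               scoring depends on how often a pixel value occurs).
    """
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    matches = diff <= match_tol

    so[matches] += 1          # background pixel seen again: reinforce SO
    sn[matches] = 0           # the competing candidate is reset
    sn[~matches] += 1         # a new gray value keeps occurring: grow SN

    # Candidate has outscored the background: replace and swap the scores.
    replace = sn > t1
    background[replace] = frame[replace]
    so[replace] = sn[replace]
    sn[replace] = 0
    return background, so, sn
```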
Fig. 4 describes the whole process by which the background updating method works self-adaptively in real time when a vehicle stays briefly in the video acquisition area. The figure also shows that the method captures variations in the background environment rapidly and accurately, thereby enhancing the robustness of the background difference.
Fig. 4. Background update: (a) initial background, (b) shortly stayed vehicle is driving into the background, (c) shortly stayed vehicle has driven into the background completely, (d) shortly stayed vehicle starts to drive out of the background, (e) shortly stayed vehicle has mostly driven out of the background, and (f) shortly stayed vehicle has driven out of the background completely.
2.3 Comparison of Background Extraction Effects
Fig. 5 shows the background extraction effects of the proposed method for two traffic scenes.
Fig. 5. Background extraction effect for different scenes: (a) foreground of scene 1, (b) background of scene 1, (c) foreground of scene 2, and (d) background of scene 2.
In order to test the background extraction effect of this method, we compared it with the mixture of Gaussians (MoG) background model [16], the improved adaptive kernel density estimation (MAKDE) method [17], and the hierarchical codebook (HCB) method [18]. Fig. 6 shows the difference images between the original images and the background images.
Because of camera vibration, the pixels of the road isolation fence in scene 1 exhibit ripple-like noise. The experimental results show that the proposed method and the MAKDE method can effectively suppress these ripple pixels. The left part of scene 2 contains many pedestrians, and the greenbelt is not motionless because of the wind, both of which produce background interference. Fig. 6 shows that the proposed method can effectively build the background model and achieves good moving object region detection results.
Fig. 6. Background extraction method comparison: (a) MoG method, (b) HCB method, (c) MAKDE method, and (d) proposed method.
3. Moving Object Segmentation and Shadow Elimination
The system tracks and recognizes moving objects based on the analysis of the foreground image. Therefore, the first detection step is to segment the moving objects (including motor vehicles and pedestrians) from the background and represent them as a binary image.
3.1 Moving Object Area Extraction
After the initial background image is produced, the background difference method can be adopted to extract the moving object region in the current frame. It computes, for each pixel, the absolute difference between the gray value in the current input image and the gray value of the corresponding pixel in the background image.
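In symbols, with [TeX:] $$D_{i, j}$$ the difference image,

[TeX:] $$D_{i, j}=\left|I_{i, j}-B_{i, j}\right|$$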
Then the corresponding binary difference image can be calculated as follows:
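With [TeX:] $$F_{i, j}$$ denoting the binary difference image (the symbol is introduced here for notation),

[TeX:] $$F_{i, j}=\begin{cases}255, & D_{i, j}>T_{2} \\ 0, & D_{i, j} \leq T_{2}\end{cases}$$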
Here a threshold [TeX:] $$T_{2}$$ is set: pixels whose difference value is greater than [TeX:] $$T_{2}$$ belong to the moving object area; otherwise they belong to the background.
Fig. 7 describes the process of extracting the moving object area.
Fig. 7. The process of extracting the moving object area: (a) the current input image, (b) the generated background image, (c) the difference image, and (d) the binary difference image.
3.2 Moving Cast Shadows Elimination
For scenes captured in sunny daytime conditions, the cast shadows that move along with the vehicles are included in the motion object area extracted by the background difference method. If these shadows are not eliminated, the accuracy of traffic parameter detection will decrease. Based on the work in [19,20], a simple and effective shadow elimination method is proposed.
1) The pre-extraction of the moving cast shadows
Moving cast shadows have two essential properties. One is that their brightness is lower than the brightness of the surroundings. The other is that the absolute difference between moving cast shadow pixels and background pixels is smaller than that between vehicle pixels and background pixels. Therefore, the moving cast shadows can be preliminarily extracted from the obtained moving object areas by threshold detection.
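One way to formalize this threshold detection, restricted to pixels inside the extracted moving object area (the mask symbol SH and the threshold [TeX:] $$T_{3}$$ are introduced here for illustration, chosen to lie between the typical shadow and vehicle differences), is

[TeX:] $$S H_{i, j}=\begin{cases}255, & \left(I_{i, j}<B_{i, j}\right) \text { AND }\left(\left|I_{i, j}-B_{i, j}\right|<T_{3}\right) \\ 0, & \text {otherwise}\end{cases}$$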
The pre-extracted moving cast shadows usually contain parts of the object areas in addition to the real shadows. If all of them are treated as shadows indiscriminately, the object areas may become empty or discontinuous, which degrades the parameter detection accuracy of the system. Therefore, the shadow areas should be filtered before being eliminated from the extracted moving object areas.
2) The edge detection and binary processing of moving object areas
We adopt the Sobel gradient operator to detect edges in the extracted moving object areas. The Sobel edge detection can be expressed mathematically as follows:
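In the standard Sobel formulation,

[TeX:] $$G_{x}=\begin{bmatrix}-1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1\end{bmatrix} * I, \quad G_{y}=\begin{bmatrix}-1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1\end{bmatrix} * I, \quad G_{i, j}=\sqrt{G_{x, i, j}^{2}+G_{y, i, j}^{2}}$$

where * denotes two-dimensional convolution.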
The corresponding binary edge image can be expressed as follows:
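In symbols, with E denoting the binary edge image,

[TeX:] $$E_{i, j}=\begin{cases}255, & G_{i, j}>T_{5} \\ 0, & \text {otherwise}\end{cases}$$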
where [TeX:] $$T_{5}$$ is defined as the binarization threshold.
3) The edge pixel detection in moving cast shadows
After obtaining the binary edge image using Eq. (2), we develop a new shadow edge pixel detection operator based on the pixel neighborhood property. The detailed description is given as follows:
A 5×5 window is moved over the binary edge image. If the gray value of the window center pixel is 255, the window is convolved with four one-dimensional Laplacian operators, each of which is sensitive to one direction. The maximum of the absolute values of the four convolution results is used to judge whether the window center pixel is a moving cast shadow edge pixel. It can be expressed as follows:
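Writing [TeX:] $$W_{i, j}$$ for the 5×5 window centered at (i, j), the decision statistic is

[TeX:] $$M_{i, j}=\max _{p=1, \ldots, 4}\left|K_{p} * W_{i, j}\right|$$

which is compared with a decision threshold. A plausible reading (an assumption, since the text does not state the direction of the comparison) is that shadow regions yield weak edge responses in all four directions, so the center pixel is classified as a shadow edge pixel when [TeX:] $$M_{i, j}$$ is small.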
[TeX:] $$K_{p}$$ is defined as the No. p convolution operator.
Fig. 8 describes the moving cast shadow elimination process and shows that this method can eliminate moving cast shadows effectively, especially large cast shadows. Combining the above three steps, the moving object area with cast shadows eliminated can be re-extracted.
Fig. 8. The process of eliminating moving cast shadows: (a) the pre-extraction of the cast shadows, (b) the binary edge image, (c) the elimination of the cast shadow edge pixels, and (d) the binary image of the object areas.
3.3 Comparison of Shadow Elimination Methods
To further demonstrate the effects of shadow elimination, Fig. 9 shows the shadow elimination results for scene 1 and scene 2 obtained by the method of this paper, the kernel-based learning (KBL) method [21], and the adaptive shadow estimation (ASE) method [22]. The comparison shows that the KBL method has a weak detection effect, with many small regions falsely detected as shadows, while in the ASE method some shadow regions are falsely detected as moving objects. The method of this paper has the best detection effect.
The study [23] proposed the shadow detection rate and the object detection rate to indicate the shadow elimination effect. They are defined as follows:
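In the usual formulation of these rates (with [TeX:] $$\eta \text { and } \xi$$ denoting the shadow detection rate and the object detection rate, respectively),

[TeX:] $$\eta=\frac{T P_{s}}{T P_{s}+F N_{s}}, \quad \xi=\frac{T P_{f}}{T P_{f}+F N_{f}}$$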
[TeX:] $$T P_{s} \text { and } T P_{f}$$ are the numbers of correctly detected shadow and foreground pixels, and [TeX:] $$F N_{s} \text { and } F N_{f}$$ are the numbers of wrongly classified shadow and object pixels, respectively. Table 1 shows the shadow elimination comparison among the KBL method, the ASE method, and the method of this paper. The comparison results show that this method discriminates object shadows better than the other methods.
Fig. 9. Elimination of moving object shadows: (a) the input image, (b) KBL method, (c) ASE method, and (d) proposed method.
Table 1. Shadow detection efficiency comparison
4. Vehicle Tracking
4.1 The Setting of Surveyed Area
In order to reduce the complexity of the location and tracking processing and to improve the detection accuracy, a virtual detection region is set in the traffic scene, and only the vehicles that drive into the virtual detection region are detected.
4.2 Vehicle Tracking
At present, most existing traffic flow detection systems have no tracking function; they count traffic flow directly from signal changes. This usually causes overcounting when a vehicle straddles two traffic lanes or changes lanes, impacting the detection precision [24]. Therefore, this system adds a vehicle tracking process.
Vehicle tracking is the process of determining the spatio-temporal changes of vehicles. It requires the system to map the detected vehicles between two adjacent frames: the two objects with the maximum feature match are regarded as the same vehicle appearing in the two adjacent frames. Obviously, the tracking effect depends on the choice of vehicle characteristics, which must accurately reflect the spatial and temporal features of the vehicle [25]. Meanwhile, the vehicle tracking module must be simple and efficient, as it is called frequently in the system; therefore, the chosen characteristics should be as few as possible to reduce computational complexity. Three characteristics are used:
1) Spatial location characteristic: The vehicle's spatial location information reflects its driving state and is one of the most important characteristics. Because the acquisition device is fixed, all changes in a vehicle's spatial position are reflected in the video image [26,27]. The vehicle barycenter coordinate [TeX:] $$\left(C x_{m}^{t}, C y_{m}^{t}\right)$$ is used as the spatial location characteristic, where the subscript m indicates that the characteristic belongs to the No. m vehicle.
2) Vehicle color characteristic: The vehicle color information reflects the overall appearance, which is an important cue [28]. The mean values [TeX:] $$E Y_{m}^{t}, E U_{m}^{t}, E V_{m}^{t}$$ of the vehicle's three color components Y, U, and V are used as the color characteristics.
3) Vehicle fractal dimension characteristic: The fractal dimension can be used to describe the roughness of the vehicle shape.
After extracting the vehicle characteristics, the system matches vehicles between adjacent frames by calculating a match degree for every pair of vehicles. The pair with the highest match degree is selected, and its match degree is compared with a threshold. If the match degree is greater than the threshold, the two detections are regarded as the same vehicle and the vehicle tracking list is updated; otherwise there is no match, and the vehicle appearing in the next frame is considered a new target and added to the tracking list as a new vehicle.
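A minimal sketch of how the three cues might be combined into a single match degree is given below. The weights, the exponential similarity kernels, the distance scales, and the box-counting procedure are all illustrative assumptions; the text only states that location, color, and fractal dimension information are combined and thresholded.

```python
import numpy as np

def box_counting_dimension(mask):
    """Rough box-counting estimate of the fractal dimension of a
    binary vehicle mask: count occupied boxes at several scales and
    fit the log-log slope (scales are illustrative)."""
    mask = mask.astype(bool)
    sizes = [2, 4, 8, 16]
    counts = []
    for s in sizes:
        h, w = mask.shape
        boxes = 0
        for y in range(0, h, s):
            for x in range(0, w, s):
                if mask[y:y + s, x:x + s].any():
                    boxes += 1
        counts.append(max(boxes, 1))
    # count(s) ~ s^(-D), so the slope of log(count) vs log(1/s) is D.
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

def match_degree(a, b, w=(0.4, 0.4, 0.2)):
    """Match degree between two vehicle feature tuples from adjacent frames.

    a, b: dicts with keys 'centroid' (Cx, Cy), 'yuv' (EY, EU, EV), and
          'fd' (fractal dimension). Weights w and the distance scales
          below are assumptions, not values from the paper.
    """
    d_pos = np.hypot(a['centroid'][0] - b['centroid'][0],
                     a['centroid'][1] - b['centroid'][1])
    d_col = np.linalg.norm(np.subtract(a['yuv'], b['yuv']))
    d_fd = abs(a['fd'] - b['fd'])
    # Map each distance to a (0, 1] similarity and combine them linearly.
    return (w[0] * np.exp(-d_pos / 20.0) +
            w[1] * np.exp(-d_col / 30.0) +
            w[2] * np.exp(-d_fd))
```

In use, the match degree would be computed for every candidate pair between adjacent frames; the highest-scoring pair is accepted when it exceeds the tracking threshold, and unmatched detections are registered as new vehicles.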
5. Discussions
In the traffic flow detection algorithm, the background construction technology and the moving cast shadow elimination technology are key factors in ensuring the accuracy of the traffic parameters. Fig. 10 shows the effectiveness of the algorithm on images captured in the daytime.
Fig. 10. Demo in daytime: (a) the current input image, (b) the generated background image, (c) the binary image of the moving object area with shadows eliminated, and (d) the vehicle tracking image.
The traffic flow detection system is an important part of the road traffic monitoring system. It is used to measure highway and urban traffic parameters such as vehicle flow rate, type, speed, and road occupancy. These parameters are effective evidence for controlling traffic access and an important means of keeping traffic safe and smooth. The video traffic flow detection system has become a hot research topic in the field of traffic flow detection because of its advantages: easy installation, convenient maintenance, low investment cost, rich traffic parameters, and better accordance with human subjective vision.
In view of the poor adaptability, low vehicle classification accuracy, and lack of vehicle tracking in existing systems, this study has put forward some new ideas for the detection algorithm:
1) Considering that the quality of the constructed background greatly influences system performance, a motion estimation method based on three-consecutive-frame differencing was developed to construct the initial background. Meanwhile, a statistical scoring method was adopted to update this background in real time.
2) Moving cast shadows may be detected as parts of the vehicles and reduce the precision of traffic flow parameter detection; therefore, this study put forward a simple and effective shadow elimination algorithm based on the general properties of moving cast shadows.
3) Since existing systems suffer from overcounting due to the lack of a tracking algorithm, this study proposed a novel matching and tracking algorithm that makes use of vehicle location, color, and fractal dimension information.
The experiments show that the system can measure various traffic flow parameters quickly and effectively, providing a solid foundation for traffic management automation.
Acknowledgement
The authors are grateful to the National Natural Science Foundation of China (61861025) and Opening Foundation of Key Laboratory of Opto-technology and Intelligent Control (Lanzhou Jiaotong University), Ministry of Education.