Dynamic vehicle detection and tracking can provide essential data to solve the problem of road planning and traffic management. A method for real-time vehicle detection and tracking using deep neural networks is proposed in this paper and a complete network architecture is presented. Using our model, you can obtain vehicle candidates, vehicle probabilities, and their coordinates in real-time. The proposed model is trained on the PASCAL VOC 2007 and 2012 image set and tested on ImageNet dataset. By a carefully design, the detection speed of our model is fast enough to process streaming video. Experimental results show that proposed model is a real-time, accurate vehicle detector, making it ideal for computer vision application.
In today’s society, more and more vehicles are taking to the highways every year, which makes a push to monitor and control the traffic more efficiently. The real-time vehicle detection and tracing is essential for intelligent road routing, road traffic control, road planning and so on. Therefore, it is important to know the road traffic density real time, especially in mega cities for signal control and effective traffic management. For a long time, several approaches[1,2] in the literature have been proposed to resolve the problem of various moving vehicles; Nevertheless, the aim of real-time fully-automatic detection of vehicle is far from being attained as it needs improvement in detection and tracking for accurate prediction with faster processing speed. Zheng et al. use brake lights detection through color segmentation method to generate vehicle candidates and verify them through a rule-based clustering approach. A tracking-by-detection scheme based on Harris-SIFT feature matching is then used to learn the template of the detected vehicle on line, localize and track the corresponding vehicle in live video . It is a good measure to extract vehicle areas, however, it needs a relatively ideal background. Wei Wang et al. have presented a method of multi-vehicle tracking and counting using a fisheye camera based on simple feature points tracking, grouping and association. They integrates low level feature-point based tracking and higher level “identity appearance” and motion based real-time association . However, the average processing time of it is around 750ms, which is not fast enough to achieve the real-time processing. System based Convolutional Neural Networks (CNN) can provide the solution of many contemporary problems in vehicle detection and tracing. CNN currently outperform other techniques by a large margin in computer vision problems such as classification  and detection . The training procedure of CNN automatically learn the weights of the filters, so that they are able to extract visual concepts from raw image content. Using the knowledge obtained through the analysis of the training set containing labelled vehicle and non-vehicle examples, vehicle can be identified in given images. In general, Convolutional Neural Networks show more promising results. In this paper, we propose a method of real-time vehicles detection and tracking using Convolutional Neural Networks. We present a network architecture, which create multiple vehicle candidates and predict vehicle probabilities in one evaluation. Our architecture uses features from the entire image to create vehicle candidates. Firstly, we use convolutional layers of the system to extract features from the raw image. Secondly, we use four kinds of inception modules. Thirdly, we add Spatial Pyramid Pooling (SPP) layer between convolutional layers and fully connected layers, which is able to resize any images into fixed size. Lastly, the fully connected layers predict the probability and coordinates of vehicles.