The Automatic Object Detection System Based on TLD Framework
Abstract: As the data processing capability of computers has improved, sensor, audio, and automatic control technologies have developed continuously, and the information carried by video frames and images, one of the main sources from which humans obtain information about the world, has attracted considerable attention. Computer vision, a currently active research area, poses many technical challenges, such as detection, motion analysis, scene reconstruction, and image restoration, and object detection is among the most important of them. Although many object detection systems on the market achieve high detection accuracy, they lack auxiliary functions and therefore offer a poor human-machine interaction experience. Many developers consequently focus on how to design detection systems with better human-machine interaction so that such systems can be widely accepted. In this paper, we propose a system framework that combines object detection and voice processing. First, we improve the Tracking-Learning-Detection (TLD) algorithm: we train a classifier on image sets of the object to be detected, and then use the classifier to determine whether a new object is the target, thereby detecting the specified object. Second, the system includes a speech recognition module for better human-machine interaction, so that the user can add image data to the data set and update the classifier by voice. To ensure the accuracy of speech recognition, we use Dynamic Time Warping (DTW) to match phonetic features.
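To make the DTW matching step concrete, the following is a minimal sketch of the classic dynamic-programming formulation of DTW, used here to pick the stored voice-command template closest to an input feature sequence. It is an illustration under stated assumptions, not the system's actual implementation: the abstract does not specify the feature type, the local distance, or any warping constraints, so the Euclidean frame distance, the function names `dtw_distance` and `match_command`, and the command names in the usage example are all hypothetical.

```python
import math


def dtw_distance(a, b):
    """Return the DTW alignment cost between two sequences of feature vectors.

    Assumption: each frame is a tuple of floats and the local cost is the
    Euclidean distance between frames (the abstract does not fix either).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            # Allowed moves: advance in a, advance in b, or advance in both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def match_command(features, templates):
    """Return the name of the template with the smallest DTW cost.

    `templates` maps a (hypothetical) command name to its reference
    feature sequence.
    """
    return min(templates, key=lambda name: dtw_distance(features, templates[name]))
```

As a usage example with made-up one-dimensional features, `match_command([(0.0,), (0.0,), (1.0,), (2.0,)], {"add": [(0.0,), (1.0,), (2.0,)], "update": [(5.0,), (5.0,), (4.0,)]})` selects `"add"`, because the input is a time-stretched copy of that template. DTW tolerates such differences in speaking rate, which is why it suits template-based command matching.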
Cite this article: Li, X.Y., Wang, C.N., Xie, M. and Wang, C.D. (2016) Automatic Object Recognition System Based on TLD. Computer Science and Application, 6, 248-264. doi: 10.12677/CSA.2016.64031