
Scale-invariant optical flow in tracking using a pan-tilt-zoom camera

Published online by Cambridge University Press:  09 December 2014

Salam Dhou
Affiliation:
Department of Electrical and Computer Engineering, Virginia Commonwealth University, Richmond, VA 23284-3068, USA
Yuichi Motai*
Affiliation:
Department of Electrical and Computer Engineering, Virginia Commonwealth University, Richmond, VA 23284-3068, USA
*Corresponding author. E-mail: ymotai@vcu.edu

Summary

An efficient method for tracking a target using a single Pan-Tilt-Zoom (PTZ) camera is proposed. The proposed Scale-Invariant Optical Flow (SIOF) method estimates the motion of the target and rotates the camera accordingly to keep the target at the center of the image. SIOF also estimates the scale of the target and changes the focal length correspondingly to adjust the Field of View (FoV), so that the target appears at the same size in all captured frames. SIOF is a feature-based tracking method. Feature points are extracted and tracked using Optical Flow (OF) and the Scale-Invariant Feature Transform (SIFT). They are combined into groups to achieve robust tracking, and the feature points in these groups are used within a twist model to recover the 3D free motion of the target. The merits of the proposed method are (i) an efficient scale-invariant tracking method that tracks the target and keeps it in the FoV of the camera at the same size, and (ii) tracking with prediction and correction, which speeds up the PTZ control and achieves smooth camera control. Experiments were performed on online video streams and validated the efficiency of SIOF in comparison with OF, SIFT, and other tracking methods. The proposed SIOF has around 36% less average tracking error and around 70% less tracking overshoot than OF.
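The re-centering and size-preserving behavior described in the summary can be illustrated with a minimal sketch. It is not the authors' SIOF implementation: it assumes an ideal pinhole camera with the focal length expressed in pixels, treats pan and tilt as independent axes, and takes the target position and apparent size as already estimated by some tracker. The function name `ptz_correction` and all of its parameters are hypothetical.

```python
import math


def ptz_correction(target_xy, image_size, focal_px, size_px, ref_size_px):
    """Return (pan_rad, tilt_rad, new_focal_px) for one control step.

    Assumptions (not from the paper): ideal pinhole model, focal length
    in pixels, principal point at the image center, pan and tilt treated
    as independent rotations.
    """
    if focal_px <= 0 or size_px <= 0 or ref_size_px <= 0:
        raise ValueError("focal length and sizes must be positive")

    # Offset of the target from the image center, in pixels.
    cx, cy = image_size[0] / 2.0, image_size[1] / 2.0
    dx = target_xy[0] - cx
    dy = target_xy[1] - cy

    # Angle subtended by each offset. Positive pan means the target is to
    # the right of center; positive tilt means it is below center (image
    # y grows downward).
    pan = math.atan2(dx, focal_px)
    tilt = math.atan2(dy, focal_px)

    # Under the pinhole model the apparent size is proportional to the
    # focal length, so scaling the focal length by ref/current restores
    # the reference size.
    new_focal = focal_px * ref_size_px / size_px
    return pan, tilt, new_focal


if __name__ == "__main__":
    pan, tilt, focal = ptz_correction(
        target_xy=(420, 240), image_size=(640, 480),
        focal_px=800.0, size_px=50.0, ref_size_px=100.0,
    )
    print(f"pan={math.degrees(pan):.2f} deg, "
          f"tilt={math.degrees(tilt):.2f} deg, focal={focal:.1f} px")
```

In this example the target appears at half its reference size, so the sketch doubles the focal length. A real PTZ controller would also have to account for actuator latency and zoom-dependent calibration, which is where the prediction and correction step mentioned in the summary would come in.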

Type
Articles
Copyright
Copyright © Cambridge University Press 2014 


Supplementary material: PDF
