Comparative analysis of image hashing algorithms for visual object tracking

Vitalii Naumenko, Sergiy Abramov, Vladimir Lukin

Abstract


The subject of the research is visual object tracking using various image hashing algorithms for real-time tracking tasks. The goal of this study is to evaluate the tracking success and processing speed of existing and new hashing algorithms for object tracking and to identify the algorithms best suited to limited computational resources. The objectives of the research are: to develop and implement object tracking based on the aHash, dHash, pHash, mHash, LHash, and LDHash algorithms; to compare the processing speed and accuracy of these methods on the video sequences "OccludedFace2," "David," and "Sylvester"; to determine the tracking success rate (TSR) and frames per second (FPS) for each algorithm; and to analyze the impact of the search window size, search strategy, and type of hashing on tracking quality and provide recommendations for their use. The study also explores the trade-off between accuracy and processing speed for each algorithm under limited computational resources. The methods involve testing and evaluating the accuracy and speed of image hashing algorithms on different test video sequences, with object similarity measured by the Hamming distance between hashes. The results show that the aHash and mHash algorithms achieve the best accuracy for all hash window sizes; aHash has a higher processing speed, while mHash is more robust to changes in lighting and object position. The dHash and pHash algorithms were less effective than aHash and mHash because of their sensitivity to changes in scale and rotation. However, perceptual hashing-based methods, such as pHash, are more robust to contrast and blurring. Conclusions. The best hashing algorithms for real-time object-tracking tasks are aHash and mHash.
This study underscores the importance of selecting hashing algorithms and search strategies tailored to specific application scenarios and offers possibilities for further optimization.
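
To make the approach in the abstract concrete, the following is a minimal sketch of hash-based tracking with aHash and the Hamming distance. It is not the authors' implementation: the 8x8 hash size, the block-mean downsampling, the exhaustive search within a fixed radius, and all function names (`shrink`, `ahash`, `hamming`, `crop`, `track`) are assumptions chosen for illustration. Frames are plain lists of grayscale rows so that the example needs no external libraries.

```python
from typing import List, Tuple

Gray = List[List[int]]  # grayscale image: rows of 0..255 intensities


def shrink(img: Gray, size: int) -> List[List[float]]:
    """Downsample to size x size by averaging roughly equal blocks."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(size):
        r0 = i * h // size
        r1 = max((i + 1) * h // size, r0 + 1)
        row = []
        for j in range(size):
            c0 = j * w // size
            c1 = max((j + 1) * w // size, c0 + 1)
            block = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def ahash(img: Gray, size: int = 8) -> int:
    """Average hash: one bit per cell, set when the cell exceeds the mean."""
    cells = [v for row in shrink(img, size) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | int(v > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def crop(img: Gray, x: int, y: int, w: int, h: int) -> Gray:
    """Cut a w x h window whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in img[y:y + h]]


def track(frame: Gray, template_hash: int, prev_xy: Tuple[int, int],
          box: Tuple[int, int], radius: int = 4,
          size: int = 8) -> Tuple[Tuple[int, int], int]:
    """Exhaustive search around the previous position (assumed strategy).

    Returns the top-left corner with the smallest Hamming distance to the
    template hash, together with that distance.
    """
    w, h = box
    px, py = prev_xy
    best_xy, best_d = prev_xy, size * size + 1
    for y in range(max(0, py - radius), min(len(frame) - h, py + radius) + 1):
        for x in range(max(0, px - radius),
                       min(len(frame[0]) - w, px + radius) + 1):
            d = hamming(ahash(crop(frame, x, y, w, h), size), template_hash)
            if d < best_d:
                best_xy, best_d = (x, y), d
    return best_xy, best_d
```

In use, the template hash would be computed once from the object's bounding box in the first frame, and `track` would then be called on each subsequent frame with the previous position. The search radius controls the speed/accuracy trade-off the abstract refers to: a larger radius tolerates faster motion but increases the number of hashes computed per frame.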

Keywords


visual object tracking; single object tracking; image hashing; perceptual hashing

DOI: https://doi.org/10.32620/reks.2025.1.09
