Study of methods for searching and localizing objects in images from aircraft using convolutional neural networks

Rostyslav Tsekhmystro, Oleksii Rubel, Vladimir Lukin

Abstract


Unmanned and manned aerial vehicles are widely used for remote object localization and classification in systems ranging from territory surveys to law enforcement. Neural-network-based localization and classification methods require a detailed study of how well they perform on data with specific characteristics, such as vehicle detection; detecting particular object types in images acquired from aircraft can also assist in surveying hard-to-reach locations. The subject of this paper is therefore the localization and classification of objects in images acquired with digital cameras mounted on aircraft. The main focus is on the accuracy of object localization and detection achieved by the selected neural networks, which are the most important indicators of their efficiency. Inference speed is an equally important characteristic, since it directly determines whether a network can be used in tasks that require fast object localization, such as video surveillance or automated vehicle control. The goal of this study is to assess the accuracy of object localization and classification in images obtained from aircraft-mounted cameras, to measure the inference speed of the networks, and to determine the effectiveness of their application in real-world conditions. The objectives are to train YOLOv5, SSD, and Faster R-CNN on the VisDrone dataset and to further examine them on a vehicle localization dataset, obtaining statistics on the performance of the trained networks. Based on these statistics, conclusions are drawn about the effectiveness of the considered networks in terms of model speed, a localization metric (IoU), and classification metrics (Precision and Recall). Possible directions for further development of the topic are outlined in the conclusions.
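To make the evaluation criteria concrete, the sketch below is a minimal, hypothetical illustration (not the authors' evaluation pipeline) of how IoU, Precision, and Recall can be computed for the detections in a single image using PyTorch and torchvision; the greedy one-to-one matching rule, the 0.5 IoU threshold, and the toy boxes are assumptions made only for demonstration.

```python
# Illustrative sketch: matching predicted boxes to ground truth by IoU and
# deriving Precision and Recall. Assumes boxes in (x1, y1, x2, y2) pixel format.
import torch
from torchvision.ops import box_iou

def evaluate_detections(pred_boxes, gt_boxes, iou_threshold=0.5):
    """Greedily match each prediction to at most one ground-truth box."""
    if len(pred_boxes) == 0 or len(gt_boxes) == 0:
        return 0.0, 0.0, 0.0
    iou = box_iou(pred_boxes, gt_boxes)      # shape: [num_pred, num_gt]
    matched_gt, matched_ious, tp = set(), [], 0
    for p in range(iou.shape[0]):
        best_iou, best_gt = iou[p].max(dim=0)
        if best_iou >= iou_threshold and best_gt.item() not in matched_gt:
            matched_gt.add(best_gt.item())   # each GT box counted once
            matched_ious.append(best_iou.item())
            tp += 1
    fp = iou.shape[0] - tp                   # unmatched predictions
    fn = iou.shape[1] - tp                   # missed ground-truth boxes
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    mean_iou = sum(matched_ious) / len(matched_ious) if matched_ious else 0.0
    return mean_iou, precision, recall

# Toy example: one correct detection, one false alarm, one missed object.
pred = torch.tensor([[10., 10., 50., 50.], [60., 60., 90., 90.]])
gt   = torch.tensor([[12., 12., 48., 52.], [200., 200., 240., 240.]])
print(evaluate_detections(pred, gt))         # approximately (0.82, 0.5, 0.5)
```

In practice these per-image values are aggregated over the whole test set and combined with the measured inference time to compare the networks.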

Keywords


object localization; YOLOv5s; SSD; FasterRCNN; vehicle classification; aircraft


References


Wang, L., Tang, J., & Liao, Q. A Study on Radar Target Detection Based on Deep Neural Networks. IEEE Sensors Letters, 2019, vol. 3, no. 3, article no. 7000504. DOI: 10.1109/LSENS.2019.2896072.

Yu, S. Sonar Image Target Detection Based on Deep Learning. Mathematical Problems in Engineering, 2022, vol. 2022, article no. 5294151. DOI: 10.1155/2022/5294151.

Bondžulić, B., Stojanović, N., Lukin, V., Stankevich, S. A., Bujaković, D., & Kryvenko, S. Target acquisition performance in the presence of JPEG image compression. Defence Technology, 2023. DOI: 10.1016/j.dt.2023.12.006.

Zhao, M., Li, W., Li, L., Hu, J., Ma, P., & Tao, R. Single-Frame Infrared Small-Target Detection: A survey. IEEE Geoscience and Remote Sensing Magazine, 2022, vol. 10, no. 2, pp. 87-119. DOI: 10.1109/MGRS.2022.3145502.

Lei, J., Lay, T., Weiland, C., & Lu, C. Combination of Spatiotemporal ICA and Euclidean Features for Face Recognition. Artificial Intelligence in Theory and Practice. IFIP AI 2006. IFIP International Federation for Information Processing, 2006, vol. 217, pp. 395-403. DOI: 10.1007/978-0-387-34747-9_41.

Ford Blue Cruise Version 1.2 Hands-Off Review: More Automation, Improved Operation. Available at: https://www.motortrend.com/reviews/ford-bluecruise-version-1-2-first-drive-review/. (accessed 5 Jan. 2024).

Cao, Z., Kooistra, L., Wang, W., Guo, L., & Valente, J. Real-Time Object Detection Based on UAV Remote Sensing: A Systematic Literature Review. Drones, 2023, no. 7, article no. 620. DOI: 10.3390/drones7100620.

Feng, J., & Yi, C. Lightweight Detection Network for Arbitrary-Oriented Vehicles in UAV Imagery via Global Attentive Relation and Multi-Path Fusion. Drones, 2022, vol. 6, no. 5, article no. 108. DOI: 10.3390/drones6050108.

Alsamhi, S. H., Shvetsov, A. V., Kumar, S., Shvetsova, S. V., Alhartomi, M. A., Hawbani, A., Rajput, N. S., Srivastava, S., Saif, A., & Nyangaresi, V. O. UAV Computing-Assisted Search and Rescue Mission Framework for Disaster and Harsh Environment Mitigation. Drones, 2022, vol. 6, no. 7, article no. 154. DOI: 10.3390/drones6070154.

Aposporis, P. Object Detection Methods for Improving UAV Autonomy and Remote Sensing Applications. 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2020, pp. 845-853. DOI: 10.1109/ASONAM49781.2020.9381377.

Zhao, C., Liu, R.W., Qu, J., & Gao, R. Deep Learning-Based Object Detection in Maritime Unmanned Aerial Vehicle Imagery: Review and Experimental Comparisons. Engineering Applications of Artificial Intelligence, 2024, vol. 128, article no. 107513. DOI: 10.1016/j.engappai.2023.107513.

Kong, M., Roh, M., Kim, Lee, J., Kim, J., & Lee, G. Object detection method for ship safety plans using deep learning. Ocean Engineering, 2022, vol. 246, article no. 110587. DOI: 10.1016/j.oceaneng.2022.110587.

Lyu, M., Zhao, Y., Huang, C., & Huang, H. Unmanned Aerial Vehicles for Search and Rescue: A Survey. Remote Sensing, 2023, no. 15, article no. 3266. DOI: 10.3390/rs15133266.

Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., & Savarese, S. Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 658-666. DOI: 10.48550/arXiv.1902.09630.

Ting, K. M. Precision and Recall. Encyclopedia of Machine Learning, 2010, p. 781. DOI: 10.1007/978-0-387-30164-8_652.

PyTorch Documentation. Available at: https://pytorch.org/docs/stable/index.html#pytorch-documentation. (accessed 5 Jan. 2024).

Zhu, P., Wen, L., Du, D., Bian, X., Fan, H., Hu, Q., & Ling, H. Detection and Tracking Meet Drones Challenge. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, vol. 44, no. 11, pp. 7380-7399. DOI: 10.1109/TPAMI.2021.3119563.

Du, D., Qi, Y., Yu, H., Yang, Y., Duan, K., Li, G., Zhang, W., Huang, Q., & Tian, Q. The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking. European Conference on Computer Vision (ECCV), 2018, vol. 128, pp. 1141-1159. DOI: 10.1007/s11263-019-01266-1.

Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C., & Berg, A. SSD: Single Shot MultiBox Detector. Lecture Notes in Computer Science, 2016, vol. 9905, pp. 21-37. DOI: 10.1007/978-3-319-46448-0_2.

YOLOv5 by Ultralytics. Available at: https://github.com/ultralytics/yolov5. (accessed 5 Jan. 2024).

Ren, S., He, K., Girshick, R., & Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, vol. 39, no. 6, pp. 1137-1149. DOI: 10.1109/TPAMI.2016.2577031.

Simonyan, K., & Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. 3rd International Conference on Learning Representations (ICLR 2015), 2015, pp. 1-14. DOI: 10.48550/arXiv.1409.1556.

He, K., Zhang, X., Ren, S., & Sun, J. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770-778. DOI: 10.1109/CVPR.2016.90.

Tan, M., & Le, Q. V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. Proceedings of the 36th International Conference on Machine Learning, 2019, pp. 6105-6114. DOI: 10.48550/arXiv.1905.11946.

SmoothL1Loss. PyTorch Documentation. Available at: https://pytorch.org/docs/stable/generated/torch.nn.SmoothL1Loss.html. (accessed 5 Jan. 2024).

CrossEntropyLoss. PyTorch Documentation. Available at: https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html. (accessed 5 Jan. 2024).

Sutskever, I., Martens, J., Dahl, G., & Hinton, G. On the importance of initialization and momentum in deep learning. Proceedings of the 30th International Conference on Machine Learning, 2013, pp. 1139-1147.

Howard, A., Sandler, M., Chu, G., Chen, L., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., Vasudevan, V., & Le, Q. Searching for MobileNetV3. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 1314-1324. DOI: 10.1109/ICCV.2019.00140.

Girshick, R., Donahue, J., Darrell, T., & Malik, J. Region-Based Convolutional Networks for Accurate Object Detection and Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, pp. 142-158. DOI: 10.1109/TPAMI.2015.2437384.

Girshick, R. Fast R-CNN. 2015 IEEE International Conference on Computer Vision (ICCV), 2015, pp. 1440-1448. DOI: 10.1109/ICCV.2015.169.

Redmon, J., & Farhadi, A. YOLOv3: An Incremental Improvement, 2018. DOI: 10.48550/arXiv.1804.02767. (unpublished).

Kingma, D. P., & Ba, J. Adam: A Method for Stochastic Optimization. 3rd International Conference on Learning Representations (ICLR 2015), 2015. DOI: 10.48550/arXiv.1412.6980.

Rubel, A., Rubel, O., Tsekhmystro, R., Rebrov, V., & Lukin, V. Automatic Decision Undertaking on Expedience of Image Denoising Based on Filter Efficiency Prediction. Proceedings of ISSOIA Conference, 2022, pp. 504-524. DOI: 10.1007/978-981-99-4098-1_44.

Tsymbal, O. V., Lukin, V. V., Ponomarenko, N. N., Zelensky, A. A., Egiazarian, K. O., & Astola, J. T. Three-state Locally Adaptive Texture Preserving Filter for Radar and Optical Image Processing. EURASIP Journal on Applied Signal Processing, 2005, no. 8, pp. 1185-1204. DOI: 10.1155/ASP.2005.1185.

Proskura, G. A., Rubel, O. S., & Lukin, V. V. On classifier learning methodologies with application to compressed remote sensing images. Radioelectronic and Computer Systems, 2022, no. 3, pp. 174-189. DOI: 10.32620/reks.2022.3.13.




DOI: https://doi.org/10.32620/reks.2024.1.08
