Research on machine learning methods for detecting objects in difficult shooting conditions

Vitalii Serdechnyi, Olesia Barkovska, Andriy Kovalenko, Anton Havrashenko, Vitalii Martovytskyi

Abstract


The subject matter of the article is machine learning methods for object detection in images and videos under complex urban conditions, in particular poor lighting, precipitation, high scene complexity, and limited computational resources. The goal of this research is to identify the most effective deep learning models based on convolutional neural networks for object detection under challenging imaging conditions, taking into account the practical requirements for accuracy and processing speed. The tasks to be solved are: analysis of object detectors (YOLOv8–v11, DETR, SSD, Mask R-CNN, Faster R-CNN, RetinaNet); preparation of a dataset covering real weather conditions and pedestrian environments in Ukraine; experimental evaluation of the selected detectors using the metrics mAP@0.5, mAP@0.5:0.95, Recall, Precision, IoU, FPS, and F1-score; and analysis of the obtained results. The methods used are: convolutional neural networks, automated image annotation, manual correction of annotations, and comparative analysis of quality metrics (F1-score, mAP@0.5:0.95, Precision, Recall, IoU, FPS). The following results were obtained: the YOLOv10-m and YOLOv11-m models demonstrated the best quality indicators under limited visibility and varying lighting. The YOLOv11-m model was the most balanced in terms of accuracy and speed across all tested conditions: snow, rain, and sunshine. YOLOv11-m is recommended as the baseline model for real-time systems, particularly intelligent assistants for people with visual impairments.
Conclusions: The scientific novelty of the results is as follows: 1) a comprehensive evaluation of modern deep learning architectures for object detection (YOLOv8–v11, Faster R-CNN, SSD, Mask R-CNN, DETR, RetinaNet) was carried out under non-laboratory conditions, including real weather scenarios such as snow, rain, and poor lighting, which are typical of urban environments in Eastern Europe; 2) a software tool for automated model evaluation was developed, which allows simultaneous testing of multiple architectures and visualization of performance metrics (F1-score, mAP@0.5, mAP@0.5:0.95, IoU, Precision, Recall, FPS), with support for manual annotation correction and comparative model analysis; 3) it was experimentally established that the YOLOv11-m model provides the best balance of accuracy and inference speed across the tested imaging conditions, which justifies its recommendation as a baseline model for real-time vision-based assistive systems.
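
To make the quality metrics named above concrete, the following is a minimal sketch of how IoU, Precision, Recall, and F1-score are commonly computed for a single image. It is not the authors' evaluation tool: the function names, the (x1, y1, x2, y2) box format, the greedy matching strategy, and the 0.5 IoU threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels, with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(
    preds: List[Tuple[Box, float]],  # (box, confidence score)
    gts: List[Box],                  # ground-truth boxes
    iou_thr: float = 0.5,            # assumed threshold, as in mAP@0.5
) -> Tuple[float, float, float]:
    """Greedily match predictions to ground truth, highest score first."""
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            value = iou(box, gt)
            if value > best_iou:
                best_iou, best_j = value, j
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp  # predictions with no matching ground truth
    fn = len(gts) - tp    # ground-truth objects that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if gts else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    predictions = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]
    ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
    print(precision_recall_f1(predictions, ground_truth))
```

In this sketch each ground-truth box can be matched at most once, so duplicate detections of the same object count as false positives. mAP@0.5:0.95 would extend the same idea by averaging precision over recall levels and over IoU thresholds from 0.5 to 0.95.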

Keywords


method; detection; image; object; video; YOLO; weather conditions; model

DOI: https://doi.org/10.32620/reks.2025.2.04
