В’ячеслав Васильович Москаленко, Микола Олександрович Зарецький, Альона Сергіївна Москаленко


A classification model is developed that consists of a motion detector, an object tracker, a convolutional sparse-coded feature extractor, and a stacked information-extreme classifier. The motion detector is built on the difference of consecutive aligned frames, where alignment is performed through keypoint matching, homography estimation, and projective transformation. The motion detector simplifies the object classification task by reducing input data variation, and it saves resources because the motion region search model is synthesized without training. The proposed model has low computational complexity and can be used as a tool for gathering labeled datasets for a deep moving-object detector. Furthermore, a training method for the moving object detector is developed. The method consists of unsupervised pretraining of the feature extractor based on sparse coding neural gas, followed by supervised pretraining and subsequent fine-tuning of the stacked information-extreme classifier. The soft-competitive learning scheme of sparse coding neural gas facilitates robust convergence to a near-optimal distribution of neurons over the data. Sparse coding neural gas also reduces the required volume of labeled observations and computational resources. A normalized modification of S. Kullback’s information measure is used as the criterion of the classifier's training effectiveness. It is also proposed to label newly emerging data by self-labeling in cases with a high prediction score and by manual labeling in cases with a low prediction score, followed by tracking of the labeled objects. In this case, classes are balanced by undersampling within the dichotomous “one-against-all” strategy. The set of classes includes bicycle, bus, car, motorcycle, pickup truck, articulated truck, and background. Simulation results on the MIO-TCD dataset confirm the suitability of the proposed model and training method for practical use.
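The labeling and class-balancing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `route_predictions` and `one_vs_all_undersample`, the score threshold of 0.9, and the toy predictions are all assumptions made for illustration. The sketch routes high-score predictions to self-labeling and low-score ones to manual labeling, then builds a balanced "one-against-all" training set for a single class by undersampling.

```python
import random


def route_predictions(predictions, threshold=0.9):
    """Split (sample_id, label, score) triples by prediction score.

    The threshold value is an assumption; the source gives no number.
    Returns (self_labeled, manual_queue).
    """
    self_labeled, manual_queue = [], []
    for sample_id, label, score in predictions:
        if score >= threshold:
            # High score: accept the predicted label as is.
            self_labeled.append((sample_id, label))
        else:
            # Low score: defer to a human annotator.
            manual_queue.append(sample_id)
    return self_labeled, manual_queue


def one_vs_all_undersample(labeled, target, seed=0):
    """Build a balanced binary set: `target` against all other classes.

    The larger side is randomly undersampled to the size of the smaller.
    """
    rng = random.Random(seed)
    positives = [s for s, lab in labeled if lab == target]
    negatives = [s for s, lab in labeled if lab != target]
    n = min(len(positives), len(negatives))
    return rng.sample(positives, n), rng.sample(negatives, n)


# Toy predictions (hypothetical ids and scores).
predictions = [
    ("a", "car", 0.97), ("b", "bus", 0.55), ("c", "car", 0.93),
    ("d", "background", 0.99), ("e", "bicycle", 0.40), ("f", "bus", 0.95),
]
self_labeled, manual_queue = route_predictions(predictions)
car_pos, car_neg = one_vs_all_undersample(self_labeled, "car")
```

In this sketch a separate balanced set would be built for each class in turn, which matches the dichotomous strategy named in the abstract; how the authors combine the per-class sets is not stated in the source.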


classification; motion detection; object detection; convolutional neural network; sparse coding neural gas; information-extreme learning; active learning; self-taught learning




