Evaluation of Deep Learning YOLOv3 Algorithm for Object Detection and Classification

Document Type: Original Article

Authors

1 Dept. of Computer Science and Engineering, Faculty of Electronic Engineering, Menoufia University.

2 Dept. of Computer Science and Engineering, Faculty of Electronic Engineering, Menoufia University.

Abstract

You Only Look Once version 3 (YOLOv3) is a deep learning model for object detection and classification. It is a single neural network architecture that extracts features from the input image and predicts bounding boxes for all classes in the image simultaneously. This paper describes experimental work on training a deep learning model based on the YOLOv3 architecture, implemented using TensorFlow as the deep learning framework. Training was carried out on the PASCAL VOC 2007 and PASCAL VOC 2012 datasets using the Adaptive Moment Estimation (Adam) optimizer. The trained model was then tested on the VOC 2007 test dataset. The final results evaluate the performance of the YOLOv3 deep learning model for object detection and classification.
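
As an illustration of the optimizer named in the abstract, the following is a minimal sketch of a single Adam update step, written in plain Python rather than TensorFlow so that it is self-contained. It follows the standard update rule published for Adam. The function names (adam_step, minimize_quadratic), the toy quadratic objective, and the hyperparameter values (the commonly used defaults) are assumptions made for illustration only; the abstract does not state the settings used in the experiments.

```python
import math


def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a list of scalar parameters.

    params, grads, m, v are lists of floats of equal length.
    t is the 1-based step counter used for bias correction.
    The hyperparameter defaults are assumed, not taken from the paper.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and the squared gradient.
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g
        # Bias correction compensates for the zero initialisation of m and v.
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)
        # Parameter update scaled by the running gradient magnitude.
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


def minimize_quadratic(steps=2000, lr=0.01):
    """Toy usage (hypothetical): minimise f(x) = (x - 3)^2 starting from x = 0."""
    params, m, v = [0.0], [0.0], [0.0]
    for t in range(1, steps + 1):
        grads = [2.0 * (params[0] - 3.0)]  # analytic gradient of the toy objective
        params, m, v = adam_step(params, grads, m, v, t, lr=lr)
    return params[0]
```

In a real training run the gradients would come from the detection loss over the network weights, and a framework-provided optimizer would normally be used instead of a hand-written loop.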

Keywords

