(48-2) 11 * << * >> * Русский * English * Содержание * Все выпуски
Comparative analysis of neural network models performance on low-power devices for a real-time object detection task
A. Zagitov 1, E. Chebotareva 1, A. Toschev 1, E. Magid 1,2
1 Institute of Information Technology and Intelligent Systems, Kazan Federal University,
420008, Kazan, Russian Federation, Kremlevskaya St. 35;
2 School of Electronic Engineering, Tikhonov Moscow Institute of Electronics and Mathematics, HSE University,
123592, Moscow, Russian Federation, Tallinskaya street 34
PDF, 5217 kB
DOI: 10.18287/2412-6179-CO-1343
Страницы: 242-252.
Язык статьи: English.
Аннотация:
A computer vision based real-time object detection on low-power devices is economically attractive, yet a technically challenging task. The paper presents results of benchmarks on popular deep neural network models, which are often used for this task. The results of experiments provide insights into trade-offs between accuracy, speed, and computational efficiency of MobileNetV2 SSD, CenterNet MobileNetV2 FPN, EfficientDet, YoloV5, YoloV7, YoloV7 Tiny and YoloV8 neural network models on Raspberry Pi 4B, Raspberry Pi 3B and NVIDIA Jetson Nano with TensorFlow Lite. We fine-tuned the models on our custom dataset prior to benchmarking and used post-training quantization (PTQ) and quantization-aware training (QAT) to optimize the models’ size and speed. The experiments demonstrated that an appropriate algorithm selection depends on task requirements. We recommend EfficientDet Lite 512×512 quantized or YoloV7 Tiny for tasks that require around 2 FPS, EfficientDet Lite 320×320 quantized or SSD Mobilenet V2 320×320 for tasks with over 10 FPS, and EfficientDet Lite 320×320 or YoloV5 320×320 with QAT for tasks with intermediate FPS requirements.
Ключевые слова:
computer vision, image analysis, object detection, deep learning, benchmarking, optimization techniques, edge devices.
Благодарности
This paper has been supported by the Kazan Federal University Strategic Academic Leadership Program ("PRIORITY-2030").
Citation:
Zagitov A, Chebotareva E, Toschev A, Magid E. Comparative analysis of neural network models performance on low-power devices for a real-time object detection task. Computer Optics 2024; 48 (2): 242-252. DOI: 10.18287/2412-6179-CO-1343.
References:
- Javaid M, Haleem A, Singh RP, Rab S, Suman R. Exploring impact and features of machine vision for progressive industry 4.0 culture. Sens Int 2022; 3: 100132.
- Nicholson L, Milford M, Sünderhauf N. QuadricSLAM: Dual quadrics from object detections as landmarks in object-oriented SLAM. IEEE Robot Autom Lett 2019; 4(1): 1-8.
- Motoda T, Petit D, Nishi T, Nagata K, Wan W, Harada K. Shelf replenishment based on object arrangement detection and collapse prediction for bimanual manipulation. Robotics 2022; 11(5): 104.
- Elhassouny A, Smarandache F. Trends in deep convolutional neural Networks architectures: a review. 2019 Int Conf of Computer Science and Renewable Energies (ICCSRE) 2019: 1-8.
- Branco S, Ferreira AG, Cabral J. Machine learning in resource-scarce embedded systems, FPGAs, and end-devices: A survey. Electronics 2019; 8(11): 1289.
- Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv Preprint. 2017. Source: <https://arxiv.org/abs/1704.04861>.
- Abadi M, et al. TensorFlow: A system for large-scale machine learning. In Book: Keeton K, Roscoe T, eds. Proceedings of the 12th USENIX conference on operating systems design and implementation, Savannah, GA, USA, 2016. Berkeley, CA: USENIX Association; 2016: 265-283.
- Tencent/ncnn. 2018. Source: <https://github.com/Tencent/ncnn>.
- Jiang X, et al. MNN: A universal and efficient inference engine. Proc 3rd MLSys Conf 2020; 2: 1-13.
- Myrzin V, Tsoy T, Bai Y, Svinin M, Magid E. Visual data processing framework for a skin-based human detection. In Book: Ronzhin A, Rigoll G, Meshcheryakov R, eds. Interactive collaborative robotics. 6th International Conference, ICR 2021. Cham, Switzerland: Springer Nature Switzerland AG; 2021: 138-149.
- Buyval A, Gavrilenkov M, Magid E. A multithreaded algorithm of UAV visual localization based on a 3D model of environment: implementation with CUDA technology and CNN filtering of minor importance objects. 2017 Int Conf on Artificial Life and Robotics (ICAROB 2017) 2017; 22: 356-359.
- Girshick R, Donahue J, Darrell T, Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation. 2014 IEEE Conf on Computer Vision and Pattern Recognition (CVPR) 2014: 580-587.
- Liu W, et al. SSD: Single shot multibox detector. In Book: Leibe B, Matas J, Sebe N, Welling M, eds. Computer Vision – ECCV 2016. Pt I. Cham, Switzerland: Springer International Publishing AG; 2016: 21-37.
- Huang J, et al. Speed/accuracy trade-offs for modern convolutional object detectors. Proc IEEE Conf on Computer Vision and Pattern Recognition (CVPR) 2017: 3296-3297.
- Li Y, Huang H, Xie Q, Yao L, Chen Q. Research on a surface defect detection algorithm based on MobileNet-SSD. Appl Sci 2018; 8(9): 1678.
- Sandler M, Howard A, Zhu M, Zhmoginov A, Chen L-C. MobileNetV2: Inverted residuals and linear bottlenecks. 2018 IEEE/CVF Conf on Computer Vision and Pattern Recognition (CVPR) 2018: 4510-4520.
- Zhang F, Li Q, Ren Y, Xu H, Song Y, Liu S. An expression recognition method on robots based on MobileNet V2-SSD. 2019 6th Int Conf on Systems and Informatics (ICSAI) 2019: 118-122.
- Ahmed I, Ahmad M, Ahmad A, Jeon G. IoT-based crowd monitoring system: Using SSD with transfer learning. Comput Electr Eng 2021; 93: 107226.
- Kamath V, Renuka A. Deep learning based object detection for resource constrained devices: Systematic review, future trends and challenges ahead. Neurocomputing 2023; 531: 34-60.
- Tan M, Pang R, Le QV. EfficientDet: Scalable and efficient object detection. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition (CVPR) 2020: 10778-10787.
- Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: Unified, real-time object detection. Proc IEEE Conf on Computer Vision and Pattern Recognition (CVPR) 2016: 779-788.
- Zhou X, Wang D, Krähenbühl P. Objects as points. arXiv Preprint. 2019. Source: <http://arxiv.org/abs/1904.07850>.
- Tan M, Le Q. EfficientNet: Rethinking model scaling for convolutional neural networks. Int Conf on Machine Learning (ICML) 2019: 6105-6114.
- Chollet F. Xception: Deep learning with depthwise separable convolutions. IEEE Conf on Computer Vision and Pattern Recognition (CVPR) 2017: 1251-1258.
- Nguyen H-H, Tran DN-N, Jeon JW. Towards real-time vehicle detection on edge devices with Nvidia Jetson TX2. 2020 IEEE Int Conf on Consumer Electronics – Asia (ICCE-Asia) 2020: 1-4.
- Song S, Jing J, Huang Y, Shi M. EfficientDet for fabric defect detection based on edge computing. J Eng Fibers Fabr 2021; 16: 1-13.
- Abdulganeev R, Lavrenov R, Safin R, Bai Y, Magid E. Door handle detection modelling for Servosila Engineer robot in Gazebo simulator. 2022 Int Siberian Conf on Control and Communications (SIBCON) 2022: 1-4.
- Lyu S, Li R, Zhao Y, Li Z, Fan R, Liu S. Green citrus detection and counting in orchards based on YOLOv5-CS and AI edge system. Sensors 2022; 22(2): 576.
- ultralytics/yolov5. 2020. Source: <https://github.com/ultralytics/yolov5>.
- Wang C-Y, Bochkovskiy A, Liao H-YM. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv Preprint. 2022. Source: <https://arxiv.org/abs/2207.02696>.
- ultralytics/ultralytics. 2023. Source: <https://github.com/ultralytics/ultralytics>.
- Gillani IS, et al. Yolov5, Yolo-x, Yolo-r, Yolov7 Performance comparison: A survey. 8th Int Conf on Artificial Intelligence and Fuzzy Logic System (AIFZ 2022) 2022. DOI: 10.5121/csit.2022.121602.
- Nguyen H-V, Bae J-H, Lee Y-E, Lee H-S, Kwon K-R. Comparison of pre-trained YOLO models on steel surface defects detector based on transfer learning with GPU-based embedded devices. Sensors 2022; 22(24): 9926.
- Xia H, Yang B, Li Y, Wang B. An improved CenterNet model for insulator defect detection using aerial imagery. Sensors 2022; 22(8): 2850.
- Jacob B, et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition (CVPR) 2018: 2704-2713.
- Wu H, Judd P, Zhang X, Isaev M, Micikevicius P. Integer quantization for deep learning inference: Principles and empirical evaluation. arXiv Preprint. 2020. Source: <http://arxiv.org/abs/2004.09602>.
- Cantero D, Esnaola-Gonzalez I, Miguel-Alonso J, Jauregi E. Benchmarking object detection deep learning models in embedded devices. Sensors 2022; 22(11): 4205.
- Lin T-Y, et al. Microsoft COCO: Common objects in context. In Book: Fleet D, Pajdla T, Schiele B, Tuytelaars T, eds. Computer Vision--ECCV 2014. Pt V. Cham, Switzerland: Springer International Publishing Switzerland; 2014: 740-755.
- Paszke A, et al. PyTorch: An imperative style, high-performance deep learning library. NIPS'19: Proc 33rd Int Conf on Neural Information Processing Systems 2019: 8024-8035.
- Han H, Siebert J. TinyML: A systematic review and synthesis of existing research. 2022 Int Conf on Artificial Intelligence in Information and Communication (ICAIIC) 2022: 269-274.
- Kurtz M, et al. Inducing and exploiting activation sparsity for fast inference on deep neural networks. Int Conf on Machine Learning 2020: 5533-5543.
- Kamath V, A R. Performance analysis of the pretrained EfficientDet for real-time object detection on Raspberry Pi. 2021 Int Conf on Circuits, Controls and Communications (CCUBE) 2021: 1-6.
- Konaite M, Owolawi PA, Mapayi T, Malele V, Odeyemi K, Aiyetoro G, Ojo JS. Smart hat for the blind with real-time object detection using Raspberry Pi and TensorFlow lite. Proc Int Conf on Artificial Intelligence and Its Applications (icARTi '21) 2021: 6.
© 2009, IPSI RAS
Россия, 443001, Самара, ул. Молодогвардейская, 151; электронная почта: journal@computeroptics.ru; тел: +7 (846) 242-41-24 (ответственный секретарь), +7 (846) 332-56-22 (технический редактор), факс: +7 (846) 332-56-20