Lightweight Target Detection Method Based on Adaptive Feature Fusion of Optimized RetinaNet

doi:10.3969/j.issn.1000-1158.2024.11.10


Abstract: Object detection algorithms are computationally expensive and structurally complex, which makes them difficult to deploy on embedded platforms with limited computing resources. To address this, a lightweight target detection method based on an optimized RetinaNet with adaptive feature fusion is proposed. First, the algorithm draws on the Ghost Module from GhostNet to reduce the number of model parameters. Second, a spatial feature fusion mechanism improves the scale invariance of the features. Third, structural reparameterization is incorporated to increase training depth, enabling multi-branch training and single-branch inference and thereby improving the inference performance of the model. The method is evaluated on two common target detection datasets, PASCAL VOC2007 and COCO. It achieves an average precision of 54.1%, which is better than that of RetinaNet. At inference, the method occupies 170.71 MB of memory, which is 44.27% of the memory occupied by RetinaNet, indicating that the proposed algorithm greatly improves network inference speed while maintaining accuracy.
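The idea of multi-branch training with single-branch inference can be illustrated with a minimal sketch. The abstract does not specify the branch layout, so the block below assumes a RepVGG-style block (a 3x3 convolution, a 1x1 convolution, and an identity shortcut) and omits batch normalization and bias for brevity. The helper names `conv2d_same`, `merge_branches`, and `add` are hypothetical and are not taken from the paper. Because convolution is linear, the three branches can be folded into a single 3x3 kernel that should produce the same output up to floating-point rounding.

```python
import random


def conv2d_same(x, w):
    """Naive zero-padded cross-correlation (stride 1, no bias).

    x: nested lists shaped [C_in][H][W]
    w: nested lists shaped [C_out][C_in][k][k], with k odd
    """
    k = len(w[0][0])
    p = k // 2
    h, wd = len(x[0]), len(x[0][0])
    out = [[[0.0] * wd for _ in range(h)] for _ in w]
    for o, w_o in enumerate(w):
        for c, w_oc in enumerate(w_o):
            for i in range(h):
                for j in range(wd):
                    acc = 0.0
                    for di in range(k):
                        for dj in range(k):
                            r, s = i + di - p, j + dj - p
                            if 0 <= r < h and 0 <= s < wd:
                                acc += w_oc[di][dj] * x[c][r][s]
                    out[o][i][j] += acc
    return out


def merge_branches(w3, w1):
    """Fold a 1x1 branch and an identity branch into the 3x3 kernel.

    Assumes C_out == C_in so that the identity shortcut is well defined.
    """
    merged = [[[row[:] for row in kern] for kern in w_o] for w_o in w3]
    for o in range(len(w3)):
        for c in range(len(w3[o])):
            # A 1x1 kernel acts only at the centre tap of the 3x3 kernel.
            merged[o][c][1][1] += w1[o][c][0][0]
            # Identity equals a 1x1 kernel with ones on the channel diagonal.
            if o == c:
                merged[o][c][1][1] += 1.0
    return merged


def add(*tensors):
    """Element-wise sum of equally shaped [C][H][W] nested lists."""
    return [[[sum(vals) for vals in zip(*rows)] for rows in zip(*chans)]
            for chans in zip(*tensors)]


random.seed(0)
channels, size = 3, 5
x = [[[random.uniform(-1, 1) for _ in range(size)] for _ in range(size)]
     for _ in range(channels)]
w3 = [[[[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
       for _ in range(channels)] for _ in range(channels)]
w1 = [[[[random.uniform(-1, 1)]] for _ in range(channels)]
      for _ in range(channels)]

# Training-time form: three parallel branches summed together.
multi_branch = add(conv2d_same(x, w3), conv2d_same(x, w1), x)
# Inference-time form: one 3x3 convolution with the merged kernel.
single_branch = conv2d_same(x, merge_branches(w3, w1))

max_abs_diff = max(abs(a - b)
                   for ma, sa in zip(multi_branch, single_branch)
                   for ra, rb in zip(ma, sa)
                   for a, b in zip(ra, rb))
print("max abs difference:", max_abs_diff)
```

In this sketch the merged kernel replaces three operations with one at inference time, which is the source of the speed and memory savings that reparameterization aims for; a real implementation would also fold each branch's batch normalization statistics into the kernel and bias before merging.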

Key words: machine vision; target detection; optimized RetinaNet; feature fusion; lightweight; structural reparameterization

Received: 2023-03-28 Published: 2024-11-29

PACS: TB96

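Funding: National Key Research and Development Program of China (2020YFB1711000); Hebei Province Science and Technology Research and Development Plan, Science and Technology Support Program (20310302D); Hebei Province Central Government Guided Local Special Project (199477141G)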

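About the author: ZHANG Liguo (born 1978), male, from Qinhuangdao, Hebei; associate professor at Yanshan University; research interests include machine vision, fault diagnosis, and virtual reality. Email: zlgtime@163.com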

Cite this article:

ZHANG Liguo, JI Xinye, ZHANG Yupeng, GENG Xingshuo, ZHANG Sheng. Lightweight Target Detection Method Based on Adaptive Feature Fusion of Optimized RetinaNet[J]. Acta Metrologica Sinica, 2024, 45(11): 1665-1670.

Link to this article:

http://jlxb.china-csm.org:81/Jwk_jlxb/CN/10.3969/j.issn.1000-1158.2024.11.10 or http://jlxb.china-csm.org:81/Jwk_jlxb/CN/Y2024/V45/I11/1665
