汽车工程 (Automotive Engineering) ›› 2025, Vol. 47 ›› Issue (10): 1895-1904. doi: 10.19562/j.chinasae.qcgc.2025.10.005



Vulnerable Road User Detection Method Based on Image Salient Feature Fusion

Huanhuan Wang1, Lisheng Jin1,2, Ye Zhang1, Xupeng Fu1

  1. School of Vehicle and Energy, Yanshan University, Qinhuangdao 066004
  2. Hebei Key Laboratory of Special Carrier Equipment, Yanshan University, Qinhuangdao 066004
  • Received: 2024-12-06  Revised: 2025-04-26  Online: 2025-10-25  Published: 2025-10-20
  • Contact: Lisheng Jin, E-mail: jinls@ysu.edu.cn
  • Supported by: National Key R&D Program of China (2023YFB2504400); National Natural Science Foundation of China (52472440); Natural Science Foundation of Hebei Province (F2024203112)


Abstract:

To address target occlusion, feature conflict, and foreground-background ambiguity in the detection of vulnerable road users in complex scenarios, a lightweight detection algorithm based on the fusion of image saliency features is proposed in this paper. First, saliency features are extracted from the image with a reconstruction-based method and fed, together with the color image, into a convolutional neural network. Next, a lightweight non-weight-sharing feature extraction and fusion network is constructed to achieve deep feature fusion. On this basis, a mixed attention mechanism is introduced and an efficient attention layer aggregation module is proposed to improve the utilization of key features. Finally, the model is trained and tested on a purpose-built multi-class vulnerable road user dataset covering complex scenarios. The results show that the proposed model detects vulnerable road users efficiently and accurately in complex traffic scenes, achieving an average precision of 94.3%, a precision of 94.6%, and a detection speed of 23.25 FPS. Compared with the baseline network YOLOv7, the average precision is improved by 2.1% and the precision by 3.5%.
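
The abstract does not give the concrete layer configuration, so the following PyTorch sketch is an illustration only: it shows one plausible way to build a two-stream, non-weight-sharing backbone that fuses an RGB image with a single-channel saliency map (here assumed to come from the reconstruction-based extraction step) and applies a simple mixed channel-and-spatial attention block as a stand-in for the efficient attention layer aggregation module. All module names, channel widths, and strides are assumptions, not the authors' implementation.

# Illustrative sketch only: the paper's exact architecture is not given in the
# abstract. It fuses an RGB image with a single-channel saliency map through
# separate (non-weight-sharing) stems and applies a simple mixed
# (channel + spatial) attention block. All names and sizes are assumptions.
import torch
import torch.nn as nn


def conv_bn_act(in_ch, out_ch, stride=1):
    """3x3 convolution -> batch norm -> SiLU, a typical lightweight building block."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.SiLU(inplace=True),
    )


class MixedAttention(nn.Module):
    """Channel attention followed by spatial attention (a CBAM-style stand-in)."""

    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.SiLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_gate(x)                       # re-weight channels
        avg_map = x.mean(dim=1, keepdim=True)              # spatial descriptors
        max_map = x.max(dim=1, keepdim=True).values
        return x * self.spatial_gate(torch.cat([avg_map, max_map], dim=1))


class TwoStreamFusionBackbone(nn.Module):
    """Separate stems for RGB and saliency inputs, fused by concatenation and projection."""

    def __init__(self, base_ch=32):
        super().__init__()
        self.rgb_stem = nn.Sequential(conv_bn_act(3, base_ch, 2), conv_bn_act(base_ch, base_ch * 2, 2))
        self.sal_stem = nn.Sequential(conv_bn_act(1, base_ch, 2), conv_bn_act(base_ch, base_ch * 2, 2))
        self.fuse = conv_bn_act(base_ch * 4, base_ch * 2)  # concat of both streams -> project back
        self.attn = MixedAttention(base_ch * 2)

    def forward(self, rgb, saliency):
        fused = torch.cat([self.rgb_stem(rgb), self.sal_stem(saliency)], dim=1)
        return self.attn(self.fuse(fused))                 # fused feature map for a detection head


if __name__ == "__main__":
    model = TwoStreamFusionBackbone()
    rgb = torch.randn(1, 3, 640, 640)    # color image
    sal = torch.randn(1, 1, 640, 640)    # assumed reconstruction-based saliency map
    print(model(rgb, sal).shape)          # torch.Size([1, 64, 160, 160])

In a full detector, the fused feature map would then feed a YOLOv7-style neck and detection head; the sketch deliberately stops at the fused features.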

Key words: vulnerable road user, target detection, feature fusion, lightweight network, mixed attention