Automotive Engineering, 2023, Vol. 45, Issue 6: 974-988. doi: 10.19562/j.chinasae.qcgc.2023.06.008
Special topic: Intelligent Connected Vehicle Technology: Perception, HMI, and Evaluation (2023)
Xia Zhao, Zhao Li, Rui Fu, Zhenzhen Ge, Chang Wang
Received:
2022-11-18
Revised:
2023-01-17
Online:
2023-06-25
Published:
2023-06-16
Contact:
Zhenzhen Ge
E-mail:gezhenzhen@chd.edu.cn
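Abstract:
End-to-end deep convolutional neural network models for driving behavior detection lack global feature extraction capability, while the vision transformer (ViT) is poor at capturing low-level features and has a large number of parameters. To address these problems, this paper proposes a ViT model based on deep convolution and tokens dimensionality reduction for real-time detection of distracted driving behavior. The proposed model is validated through comparison experiments against other models, ablation experiments on the model itself, and visualization of the model's attention regions. It achieves an average classification accuracy of 96.93% and an average precision of 96.95% with 21.22 M parameters, and its online inference speed on a real vehicle platform is 23.32 fps, which indicates that it can detect distracted driving behavior in real time. The results can support control strategy design and distraction warning in human-machine co-driving systems.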
Xia Zhao, Zhao Li, Rui Fu, Zhenzhen Ge, Chang Wang. Real-Time Detection of Distracted Driving Behavior Based on Deep Convolution-Tokens Dimensionality Reduction Optimized Visual Transformer Model[J]. Automotive Engineering, 2023, 45(6): 974-988.
Table 3. Precision (P) and mean precision (mP) of each model (%)

| Driver behavior | Co-Td-ViT | DenseNet | ResNet-101 | EfficientNet | Inception-v4 | Swin |
|---|---|---|---|---|---|---|
| Two-handed driving | 98.37 | 97.56 | 97.98 | 94.86 | 93.77 | 97.23 |
| Looking at phone | 98.19 | 98.18 | 97.83 | 94.27 | 94.62 | 94.68 |
| Phone navigation | 96.96 | 96.93 | 97.32 | 91.76 | 90.6 | 94.3 |
| Operating the center console | 97.32 | 96.46 | 96.48 | 94.20 | 91.23 | 94.62 |
| Drinking water | 97.45 | 96.73 | 97.09 | 94.89 | 91.23 | 93.86 |
| Talking on the phone | 93.62 | 92.31 | 92.61 | 90.46 | 89.82 | 92.70 |
| Turning back to talk | 98.88 | 98.50 | 98.50 | 95.90 | 97.00 | 97.05 |
| One-handed driving | 94.86 | 94.07 | 94.07 | 94.61 | 94.40 | 95.93 |
| Average | 96.95 | 96.34 | 96.48 | 93.86 | 92.83 | 95.05 |

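
As a quick consistency check, the average row of Table 3 can be reproduced as the unweighted (macro) mean of the eight per-class values. The sketch below does this for the Co-Td-ViT column. The names `co_td_vit_precision` and `macro_mean` are illustrative, and treating mP as an unweighted mean over classes is an assumption that the tabulated numbers appear to support, not something the page states.

```python
# Per-class precision (%) of Co-Td-ViT, copied from the Co-Td-ViT column of Table 3.
co_td_vit_precision = [98.37, 98.19, 96.96, 97.32, 97.45, 93.62, 98.88, 94.86]


def macro_mean(values):
    """Unweighted mean over classes (assumed definition of mP and mR)."""
    return sum(values) / len(values)


mp = macro_mean(co_td_vit_precision)
print(f"macro-averaged precision: {mp:.3f}%")
```

The computed mean agrees with the reported 96.95 to within 0.01; the small residual is consistent with the per-class entries having been rounded before tabulation.
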
Table 4. Recall (R) and mean recall (mR) of each model (%)

| Driver behavior | Co-Td-ViT | DenseNet | ResNet-101 | EfficientNet | Inception-v4 | Swin |
|---|---|---|---|---|---|---|
| Two-handed driving | 94.53 | 93.75 | 94.53 | 93.75 | 94.14 | 96.09 |
| Looking at phone | 95.77 | 95.07 | 95.07 | 92.61 | 92.96 | 94.01 |
| Phone navigation | 97.70 | 96.93 | 97.32 | 93.87 | 92.34 | 95.02 |
| Operating the center console | 98.64 | 98.64 | 99.10 | 95.48 | 94.12 | 95.48 |
| Drinking water | 97.45 | 96.73 | 97.09 | 94.55 | 94.55 | 94.55 |
| Talking on the phone | 98.51 | 98.51 | 98.13 | 95.52 | 92.16 | 94.78 |
| Turning back to talk | 97.79 | 96.69 | 96.32 | 94.49 | 95.22 | 96.69 |
| One-handed driving | 95.24 | 94.44 | 94.44 | 90.48 | 86.90 | 93.65 |
| Average | 96.95 | 96.35 | 96.50 | 93.84 | 92.79 | 95.03 |

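
For reference, per-class precision and recall of the kind tabulated in Tables 3 and 4 are conventionally derived from a multiclass confusion matrix. The following is a minimal sketch on a small hypothetical 3-class matrix: the counts are made up for illustration and are not taken from the paper. It assumes rows are true classes and columns are predicted classes, and that the mean values are unweighted averages over classes.

```python
def precision_recall(confusion):
    """Return per-class (precision, recall) lists from a square confusion matrix.

    Assumed layout: confusion[i][j] = number of samples of true class i
    that were predicted as class j.
    """
    n = len(confusion)
    precisions, recalls = [], []
    for k in range(n):
        true_positive = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precisions.append(true_positive / predicted_k if predicted_k else 0.0)
        recalls.append(true_positive / actual_k if actual_k else 0.0)
    return precisions, recalls


# Hypothetical counts, for illustration only.
confusion = [
    [9, 1, 0],
    [2, 8, 0],
    [1, 1, 8],
]

p, r = precision_recall(confusion)
mean_p = sum(p) / len(p)  # macro-averaged precision
mean_r = sum(r) / len(r)  # macro-averaged recall
print(p, r, round(mean_p, 4), round(mean_r, 4))
```

Precision divides the diagonal entry by its column sum (everything predicted as that class), while recall divides it by its row sum (everything that truly belongs to that class), which is why a class can score high on one and lower on the other.
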
Table 6. Precision (P) and mean precision (mP) of each model (%)

| Driver behavior | Co-Td-ViT | ViT-1 | ViT-2 | Co-ViT | Td-ViT |
|---|---|---|---|---|---|
| Two-handed driving | 93.70 | 94.12 | 96.08 | 94.92 | 93.70 |
| Looking at phone | 92.36 | 93.38 | 95.76 | 94.76 | 92.36 |
| Phone navigation | 90.64 | 92.05 | 93.18 | 92.8 | 90.64 |
| Operating the center console | 91.34 | 92.58 | 94.25 | 93.48 | 91.34 |
| Drinking water | 91.01 | 91.40 | 92.93 | 92.45 | 91.01 |
| Talking on the phone | 91.39 | 91.14 | 92.96 | 92.59 | 91.39 |
| Turning back to talk | 96.58 | 97.33 | 98.11 | 98.09 | 96.58 |
| One-handed driving | 95.44 | 95.45 | 96.31 | 96.3 | 95.44 |
| Average | 92.81 | 93.43 | 94.95 | 94.42 | 92.81 |

Table 7. Recall (R) and mean recall (mR) of each model (%)

| Driver behavior | Co-Td-ViT | ViT-1 | ViT-2 | Co-ViT | Td-ViT |
|---|---|---|---|---|---|
| Two-handed driving | 92.97 | 93.75 | 95.7 | 94.92 | 92.97 |
| Looking at phone | 93.66 | 94.37 | 95.42 | 95.42 | 93.66 |
| Phone navigation | 92.72 | 93.1 | 94.25 | 93.87 | 92.72 |
| Operating the center console | 95.48 | 95.93 | 96.38 | 97.29 | 95.48 |
| Drinking water | 92.00 | 92.73 | 95.64 | 93.45 | 92.00 |
| Talking on the phone | 91.04 | 92.16 | 93.66 | 93.28 | 91.04 |
| Turning back to talk | 93.38 | 93.75 | 95.22 | 94.49 | 93.38 |
| One-handed driving | 91.27 | 91.67 | 93.25 | 92.86 | 91.27 |
| Average | 92.82 | 93.43 | 94.94 | 94.45 | 92.82 |