Automotive Engineering, 2024, Vol. 46, Issue 11: 1937-1951. doi: 10.19562/j.chinasae.qcgc.2024.11.001
Xiaolin Tang¹, Lu Gan¹, Guofa Li¹, Keqiang Li², Wenbo Chu³,⁴,⁵
Received: 2024-01-23
Revised: 2024-03-26
Online: 2024-11-25
Published: 2024-11-22
Contact: Wenbo Chu
E-mail: chuwenbo@wicv.cn
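Abstract:
With the advent of the Transformer attention mechanism, general-purpose foundation models represented by GPT have exhibited the "emergence" of intelligence, opening a path for autonomous driving to advance to higher levels. Because conventional pre-training from scratch requires large-scale, high-quality, diverse autonomous driving data and incurs high training costs, the "large model + alignment technology" paradigm has arisen. Alignment technology serves as the link between general-purpose foundation models and autonomous driving: through customization methods such as fine-tuning or prompt engineering, it can address engineering problems in the autonomous driving domain efficiently and with domain expertise. Alignment technology has become a research hotspot for large models in vertical domains, but systematic studies are still lacking. Accordingly, this paper first gives an overview of the development of autonomous driving and of large model technology, which motivates alignment technology. It then reviews the field from two perspectives, fine-tuning and prompt engineering, systematically organizing and analyzing the structural or performance characteristics of each category of techniques and presenting practical application cases. Finally, based on existing research, it identifies the research challenges and development trends of alignment technology, providing a reference for advancing autonomous driving to higher levels.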
Xiaolin Tang, Lu Gan, Guofa Li, Keqiang Li, Wenbo Chu. Large Model Alignment Technology for Autonomous Driving: A Review[J]. Automotive Engineering, 2024, 46(11): 1937-1951.