DK-Port: Construction and Validation of a Port Autonomous Driving Simulation Environment Based on Large Language Models and Reinforcement Learning

Authors: LOU Yunjie; AI Mingfei; ZHUANG Shujie; YU Hai; WANG Xin; TENG Chu; WANG Jiangcheng; SHEN Tianyu; HAO Kunkun; CUI Wen

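Affiliations: [1] Qingdao Qianwan United Container Terminal Co., Ltd., Qingdao 266520, Shandong, China; [2] Anhui Shenxin Kechuang Technology Co., Ltd., Hefei 230088, Anhui, China; [3] College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China; [4] Institute of Interdisciplinary Information Core Technology (Xi'an) Co., Ltd., Xi'an 710077, Shaanxi, China; [5] Academy of Advanced Interdisciplinary Research, Xidian University, Xi'an 710071, Shaanxi, China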

Source: Chinese Journal of Intelligent Science and Technology, 2025, No. 1, pp. 98-113 (16 pages)

Funding: Open Project of the National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi'an Jiaotong University (No. HMHAI-202416).

Abstract: Compared with conventional driving environments, port environments are characterized by busy vehicle operations, customized roadways, and mixed traffic of human-driven and autonomous vehicles. To address the scarcity of port autonomous driving data and the limited generalization of algorithms, and to shorten algorithm development cycles and reduce development costs, this paper proposes a method for constructing and validating DK-Port, a simulation environment for port autonomous driving. The design draws on embodied intelligence principles and aims to provide a realistic, controllable environment and vehicle interaction process. First, based on questionnaires and expert experience, zero-shot prompting and chain-of-thought techniques are used so that a general-purpose large language model takes part in designing the reward function. Then, based on vehicle driving data from complex roads, rich and realistic human-machine mixed-traffic interaction scenarios are constructed, and an adversarial driver model is trained with the PPO deep reinforcement learning algorithm to expose safety hazards in autonomous driving algorithms. Finally, comparative experiments are conducted in four typical scenarios, including straight roads and intersections. The results show that DK-Port effectively generates multiple types of adversarial driving behavior that better match real port characteristics, such as dangerous overtaking, abrupt cut-ins, and preemption of intersection merge points. With key metrics kept reasonably distributed, the number of lane changes in the straight-road scenario is 2.9 times that of the baseline method, and the emergency braking rate in the intersection scenario increases by 46.7%.
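The abstract states only that the adversarial driver model is trained with PPO; it gives no implementation details. As a reminder of what that algorithm optimizes, the following is a minimal sketch of the standard PPO clipped surrogate objective. The function name, the argument names, and the clipping value `clip_eps=0.2` are illustrative assumptions and are not taken from the paper.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    clip_eps is an assumed value, not one reported in the paper.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, the gain from raising an action's probability is capped once the ratio exceeds `1 + clip_eps`; with a negative advantage, the penalty is not capped on that side, which keeps each policy update conservative.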

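Keywords: port autonomous driving; simulation environment construction; large language models; reinforcement learning; adversarial scenario generation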

Classification: TP39 [Automation and Computer Technology - Computer Application Technology]
