Authors: Zilin Huang, Zihao Sheng, Chengyuan Ma, Sikai Chen
Source: Communications in Transportation Research, 2024, Issue 1, pp. 124-147 (24 pages). Chinese journal title: 交通研究通讯(英文)
Abstract: Despite significant progress in autonomous vehicles (AVs), the development of driving policies that ensure both the safety of AVs and traffic flow efficiency has not yet been fully explored. In this paper, we propose an enhanced human-in-the-loop reinforcement learning method, termed the Human as AI mentor-based deep reinforcement learning (HAIM-DRL) framework, which facilitates safe and efficient autonomous driving in mixed traffic platoons. Drawing inspiration from the human learning process, we first introduce an innovative learning paradigm that effectively injects human intelligence into AI, termed Human as AI mentor (HAIM). In this paradigm, the human expert serves as a mentor to the AI agent. While allowing the agent to sufficiently explore uncertain environments, the human expert can take control in dangerous situations and demonstrate correct actions to avoid potential accidents. In addition, the agent can be guided to minimize traffic flow disturbance, thereby optimizing traffic flow efficiency. In detail, HAIM-DRL leverages data collected from free exploration and partial human demonstrations as its two training sources. Remarkably, we circumvent the intricate process of manually designing reward functions; instead, we directly derive proxy state-action values from partial human demonstrations to guide the agent's policy learning. Additionally, we employ a minimal intervention technique to reduce the human mentor's cognitive load. Comparative results show that HAIM-DRL outperforms traditional methods in driving safety, sampling efficiency, mitigation of traffic flow disturbance, and generalizability to unseen traffic scenarios.
Keywords: Human as AI mentor paradigm; Autonomous driving; Deep reinforcement learning; Human-in-the-loop learning; Driving policy; Mixed traffic platoon
Classification code: TN9 [Electronics and Telecommunications: Information and Communication Engineering]