Human as AI mentor: Enhanced human-in-the-loop reinforcement learning for safe and efficient autonomous driving  Cited by: 1


Authors: Zilin Huang, Zihao Sheng, Chengyuan Ma, Sikai Chen

Affiliation: [1] Department of Civil and Environmental Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA

Source: Communications in Transportation Research (Chinese title: 交通研究通讯, English edition), 2024, Issue 1, pp. 124-147 (24 pages)

Abstract: Despite significant progress in autonomous vehicles (AVs), the development of driving policies that ensure both the safety of AVs and traffic flow efficiency has not yet been fully explored. In this paper, we propose an enhanced human-in-the-loop reinforcement learning method, termed the Human as AI mentor-based deep reinforcement learning (HAIM-DRL) framework, which facilitates safe and efficient autonomous driving in mixed traffic platoon. Drawing inspiration from the human learning process, we first introduce an innovative learning paradigm that effectively injects human intelligence into AI, termed Human as AI mentor (HAIM). In this paradigm, the human expert serves as a mentor to the AI agent. While allowing the agent to sufficiently explore uncertain environments, the human expert can take control in dangerous situations and demonstrate correct actions to avoid potential accidents. On the other hand, the agent could be guided to minimize traffic flow disturbance, thereby optimizing traffic flow efficiency. In detail, HAIM-DRL leverages data collected from free exploration and partial human demonstrations as its two training sources. Remarkably, we circumvent the intricate process of manually designing reward functions; instead, we directly derive proxy state-action values from partial human demonstrations to guide the agents' policy learning. Additionally, we employ a minimal intervention technique to reduce the human mentor's cognitive load. Comparative results show that HAIM-DRL outperforms traditional methods in driving safety, sampling efficiency, mitigation of traffic flow disturbance, and generalizability to unseen traffic scenarios.
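
The two training sources named in the abstract (free exploration and partial human demonstrations) can be illustrated with a minimal data-collection sketch. This is not the authors' implementation: `Buffers`, `collect_step`, the toy policy, and the toy takeover rule are all hypothetical names and assumptions introduced here, and the sketch does not cover how proxy state-action values are derived or how interventions are minimized.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Transition = Tuple[float, float]  # (state, action); scalars for illustration only


@dataclass
class Buffers:
    """Hypothetical containers for the two training sources."""
    exploration: List[Transition] = field(default_factory=list)
    demonstration: List[Transition] = field(default_factory=list)


def collect_step(
    state: float,
    agent_policy: Callable[[float], float],
    human_override: Callable[[float, float], Optional[float]],
    buffers: Buffers,
) -> float:
    """Run one step: the agent acts unless the human mentor takes over."""
    agent_action = agent_policy(state)
    human_action = human_override(state, agent_action)
    if human_action is None:
        # No takeover: the agent's own action counts as free exploration.
        buffers.exploration.append((state, agent_action))
        return agent_action
    # Takeover: the human's action is stored as a partial demonstration.
    buffers.demonstration.append((state, human_action))
    return human_action


# Toy stand-ins (assumptions): a linear policy and a mentor who
# intervenes only when the proposed action magnitude exceeds 1.0.
def toy_policy(state: float) -> float:
    return 2.0 * state


def toy_mentor(state: float, action: float) -> Optional[float]:
    return 0.0 if abs(action) > 1.0 else None


buffers = Buffers()
applied = [collect_step(s, toy_policy, toy_mentor, buffers) for s in (0.1, 0.4, 0.9)]
```

In this toy run the mentor takes over only on the last step, so two transitions land in the exploration buffer and one in the demonstration buffer.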

Keywords: Human as AI mentor paradigm; Autonomous driving; Deep reinforcement learning; Human-in-the-loop learning; Driving policy; Mixed traffic platoon

Classification code: TN9 [Electronics and Telecommunications: Information and Communication Engineering]

 
