Traffic Signal Coordination Control Based on Improved Multi-Agent Nash Q Learning

Authors: SU Gang; YE Baolin; YAO Qing [1]; CHEN Bin; ZHANG Yijia (School of Information Science and Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China; Jiaxing Key Laboratory of Smart Transportations, Jiaxing University, Jiaxing 314001, China)

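Affiliations: [1] School of Information Science and Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, Zhejiang, China [2] Jiaxing Key Laboratory of Smart Transportations, Jiaxing University, Jiaxing 314001, Zhejiang, China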

Source: Software Engineering (《软件工程》), 2024, No. 10, pp. 43-49 (7 pages)

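Funding: National Natural Science Foundation of China (61603154); Zhejiang Provincial Natural Science Foundation (LTGS23F030002); Jiaxing Applied Basic Research Project (2023AY11034); Open Research Project of the State Key Laboratory of Industrial Control Technology (ICT2022B52).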

Abstract: To optimize the timing scheme of regional traffic signals and improve regional traffic efficiency, this paper proposes a regional traffic signal coordination control method based on improved multi-agent Nash Q Learning. First, a discretization coding method is employed: the road is divided into cells so that continuous state information is converted into a discrete form. Second, a Long Short Term Memory (LSTM) module is incorporated into the algorithm to mine more hidden information from the state data and enrich the state data in the Q value table. Finally, simulation tests based on the microscopic traffic simulation software SUMO (Simulation of Urban Mobility) show that, compared with the original Nash Q Learning traffic signal control method, the proposed method reduces the average vehicle waiting time by 11.5%, 16.2%, and 10.0% under low, medium, and high traffic flows, respectively. It also reduces the average queue length by 9.1%, 8.2%, and 7.6%, and the average number of stops by 18.3%, 16.1%, and 10.0%, respectively. The results demonstrate that the algorithm achieves better control performance.
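The discretization step mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the cell length, the features recorded per cell, or how lanes are combined, so everything here is an assumption for illustration: the function names `encode_lane` and `state_key`, the binary occupancy encoding, and the 10 m cell length in the usage note are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def encode_lane(positions_m: Sequence[float],
                lane_length_m: float,
                cell_length_m: float) -> List[int]:
    """Map continuous vehicle positions on one lane to a 0/1 occupancy vector.

    Assumption (not from the paper): a cell is 1 if at least one vehicle
    lies inside it, otherwise 0. Positions outside the lane are ignored.
    """
    if lane_length_m <= 0 or cell_length_m <= 0:
        raise ValueError("lane_length_m and cell_length_m must be positive")
    n_cells = math.ceil(lane_length_m / cell_length_m)
    cells = [0] * n_cells
    for pos in positions_m:
        if 0 <= pos < lane_length_m:
            # Clamp the index to guard against floating-point edge cases.
            idx = min(int(pos // cell_length_m), n_cells - 1)
            cells[idx] = 1
    return cells


def state_key(lanes: Sequence[Sequence[int]]) -> Tuple[int, ...]:
    """Flatten per-lane cell vectors into a hashable key for a Q value table."""
    return tuple(bit for lane in lanes for bit in lane)
```

For example, with a hypothetical 50 m lane and 10 m cells, `encode_lane([3.0, 12.5, 47.0], 50.0, 10.0)` marks the first, second, and fifth cells as occupied, and `state_key` turns the vectors of several lanes into a single tuple that could index a tabular Q function.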

Keywords: regional traffic signal coordination control; Markov decision process; multi-agent Nash Q Learning; LSTM; SUMO

Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
