An Efficient Communication Framework in Multi-Agent Cooperating Learning Environment (一种高效率的多智能体协作学习通信机制)  Cited by: 1


Authors: Zhao Yuhang (赵宇航), Ma Xiujun (马修军) [2] (School of Electronics Engineering and Computer Science, Peking University, Beijing 100871; Key Laboratory of Machine Perception (Peking University), Ministry of Education, Beijing 100871)

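Affiliations: [1] School of Electronics Engineering and Computer Science, Peking University, Beijing 100871; [2] Key Laboratory of Machine Perception (Peking University), Ministry of Education, Beijing 100871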

Source: Journal of Information Security Research (《信息安全研究》), 2020, No. 4, pp. 345-349 (5 pages)

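Funding: Science and Technology Project of State Grid Corporation of China Headquarters, "Research and Application of Key Technologies for Online Health Monitoring, Efficient Operation and Maintenance, and Intelligent Evaluation of Power Distribution and Consumption Equipment" (PD71-18-023).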

Abstract: Artificial intelligence is advancing rapidly, with notable breakthroughs in computer vision, natural language processing, and reinforcement learning. Most of this work, however, targets single agents and aims to keep improving the intelligence of an individual agent. Progress on multi-agent systems is better suited to complex problems such as the reproduction of animal populations or human teamwork: even when no individual agent is particularly intelligent, the group as a whole can exhibit higher intelligence if its agents communicate and cooperate efficiently. Multi-agent cooperative learning currently relies mostly on reinforcement learning frameworks, but most studies do not explicitly use a communication mechanism to improve the overall model. This paper proposes an Actor-Critic algorithm framework based on communication filtering that enables agents in a multi-agent environment to communicate efficiently; even in the execution phase, where no Critic guidance is available, efficient communication helps the agents cooperate. The framework uses a neural network to filter the information exchanged between agents, turning low-quality redundant information into high-quality low-dimensional information. Three experiments are designed to verify the model: two cooperative learning scenarios and a lane-change task in autonomous driving. The results show that, in multi-agent cooperative learning with communication, the proposed model outperforms other similar models.
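The abstract does not give the network architecture, so the following is only a minimal sketch of the general idea it describes: a filter network compresses each agent's raw message into a low-dimensional one, and each actor chooses an action from its own observation plus the filtered messages of the other agents, without consulting a critic. Every name here (`MessageFilter`, `Actor`, `execution_step`), the single dense layer with `tanh`, the shared filter, the dimensions, and the choice to broadcast raw observations as messages are illustrative assumptions, not the authors' design. Training and the Critic are omitted entirely.

```python
import math
import random

def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def init_layer(rng, n_in, n_out):
    """Hypothetical small random initialisation for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

class MessageFilter:
    """Assumed filter: compresses a raw message of size raw_dim to msg_dim values."""
    def __init__(self, rng, raw_dim, msg_dim):
        self.weights, self.bias = init_layer(rng, raw_dim, msg_dim)

    def __call__(self, raw_message):
        # tanh keeps every component of the filtered message in [-1, 1].
        return [math.tanh(v) for v in linear(self.weights, self.bias, raw_message)]

class Actor:
    """Assumed actor: scores actions from own observation plus filtered messages."""
    def __init__(self, rng, obs_dim, msg_dim, n_others, n_actions):
        self.weights, self.bias = init_layer(rng, obs_dim + msg_dim * n_others, n_actions)

    def act(self, obs, filtered_messages):
        features = list(obs)
        for msg in filtered_messages:
            features.extend(msg)
        scores = linear(self.weights, self.bias, features)
        # Greedy choice: index of the highest-scoring action.
        return max(range(len(scores)), key=scores.__getitem__)

def execution_step(observations, message_filter, actors):
    """One execution-phase step; no critic is involved."""
    # Assumption: each agent broadcasts its raw observation as its message.
    filtered = [message_filter(obs) for obs in observations]
    actions = []
    for i, actor in enumerate(actors):
        others = [m for j, m in enumerate(filtered) if j != i]
        actions.append(actor.act(observations[i], others))
    return filtered, actions

rng = random.Random(0)
N_AGENTS, OBS_DIM, MSG_DIM, N_ACTIONS = 3, 8, 2, 4
shared_filter = MessageFilter(rng, OBS_DIM, MSG_DIM)
actors = [Actor(rng, OBS_DIM, MSG_DIM, N_AGENTS - 1, N_ACTIONS) for _ in range(N_AGENTS)]
observations = [[rng.uniform(-1, 1) for _ in range(OBS_DIM)] for _ in range(N_AGENTS)]
filtered, actions = execution_step(observations, shared_filter, actors)
```

In this sketch each 8-dimensional observation is reduced to a 2-dimensional message before it reaches the other agents, which is one simple way to read "redundant information to low-dimensional information"; how the paper's filter is actually trained to keep the high-quality part is not stated in the abstract.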

Keywords: multi-agent system; reinforcement learning; cooperative learning; artificial intelligence; autonomous driving

Classification: TP302 [Automation and Computer Technology: Computer System Architecture]

 
