Authors: Zhao Yuhang (赵宇航); Ma Xiujun (马修军) [2]
Affiliations: [1] School of Electronics Engineering and Computer Science, Peking University, Beijing 100871; [2] Key Laboratory of Machine Perception (Peking University), Ministry of Education, Beijing 100871
Source: Journal of Information Security Research (《信息安全研究》), 2020, No. 4, pp. 345-349 (5 pages)
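Funding: State Grid Corporation of China headquarters science and technology project "Research and Application of Key Technologies for Online Health Monitoring, Efficient Operation and Maintenance, and Intelligent Evaluation of Power Distribution and Utilization Equipment" (PD71-18-023).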
Abstract: Artificial intelligence is advancing rapidly, with notable breakthroughs in computer vision, natural language processing, and reinforcement learning. Most of this work, however, targets single agents and aims to keep raising the intelligence of an individual agent. Progress in multi-agent systems is better suited to complex problems such as the reproduction of animal populations or human teamwork: even if no single agent is particularly intelligent, a community of agents that communicates and collaborates efficiently can be relatively intelligent as a whole. Multi-agent collaborative learning currently relies mostly on reinforcement learning frameworks, but most studies do not explicitly apply a communication mechanism to improve the overall model. This paper proposes an Actor-Critic algorithm framework based on communication filtering, which enables agents in a multi-agent environment to communicate efficiently; efficient communication helps agents collaborate well even during the execution phase, when no Critic guidance is available. The framework uses a neural network to filter the information exchanged between agents, turning low-quality redundant information into high-quality low-dimensional information. Three experiments are designed to verify the model: two collaborative learning scenarios and a lane-change task in autonomous driving. The results show that, in multi-agent collaborative learning with communication, the proposed model outperforms other similar models.
Keywords: multi-agent systems; reinforcement learning; collaborative learning; artificial intelligence; autonomous driving
Classification: TP302 [Automation and Computer Technology: Computer System Architecture]