Review of multi-agent reinforcement learning based dynamic spectrum allocation methods

Cited by: 4

Authors: SONG Bo; YE Wei; MENG Xianghui (Department of Electronic and Optical Engineering, Space Engineering University, Beijing 101416, China; Unit 95801 of the PLA, Beijing 100076, China)

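Affiliations: [1] Department of Electronic and Optical Engineering, Space Engineering University, Beijing 101416, China [2] Unit 95801 of the PLA, Beijing 100076, China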

Source: Systems Engineering and Electronics (《系统工程与电子技术》), 2021, No. 11, pp. 3338-3351 (14 pages)

Abstract: Cognitive radio and dynamic spectrum allocation are effective means of addressing the shortage of spectrum resources. With the rapid development in recent years of machine learning techniques such as deep learning and reinforcement learning, swarm intelligence technology represented by multi-agent reinforcement learning has continued to make breakthroughs, making distributed, intelligent dynamic spectrum allocation possible. This paper reviews in detail the key research results in reinforcement learning and multi-agent reinforcement learning, as well as research on modeling methods and algorithms for the dynamic spectrum allocation process based on multi-agent reinforcement learning. Existing algorithms are grouped into four types: independent Q-learning, cooperative Q-learning, joint Q-learning, and multi-agent actor-critic. The advantages and disadvantages of these four types of methods are analyzed, and the key problems of multi-agent reinforcement learning based dynamic spectrum allocation are summarized together with possible solutions.

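Keywords: spectrum management; cognitive radio; dynamic spectrum allocation; machine learning; reinforcement learning; multi-agent reinforcement learning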

Classification number: TN929.5 [Electronics and Telecommunications: Communication and Information Systems]

 
