A Joint Resource Allocation Method of D2D Communication Resources Based on Multi-agent Deep Reinforcement Learning  Cited by: 4

Authors: DENG Bingguang[1]; XU Chengyi; ZHANG Tai; SUN Yuanxin; ZHANG Lin; PEI Errong[1]

Affiliations: [1] Institute of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; [2] Electric Power Research Institute of State Grid Sichuan Electric Power Company, Chengdu 610093, China; [3] Chongqing Jinmei Communication Co., Ltd, Chongqing 400035, China; [4] State Key Laboratory of Communication Anti-interference Technology, University of Electronic Science and Technology of China, Chengdu 611731, China

Source: Journal of Electronics & Information Technology, 2023, Issue 4, pp. 1173-1182 (10 pages)

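Funding: National Major Project (2018zx0301016); National Natural Science Foundation of China (62071077); Chongqing Chengdu-Chongqing Science and Technology Innovation Project (KJCXZD2020026).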

Abstract: Device-to-Device (D2D) communication, a short-range communication technology, can greatly reduce the load on cellular base stations and improve spectrum utilization. However, deploying D2D directly in licensed or unlicensed bands inevitably causes serious interference with existing users. The resource allocation of D2D communication jointly deployed in licensed and unlicensed bands is usually modeled as a combinatorial optimization problem with mixed-integer nonlinear constraints, which is difficult to solve with traditional optimization methods. To address this challenge, this paper proposes a joint resource allocation method for D2D communication based on multi-agent deep reinforcement learning. In this algorithm, each D2D transmitter in the cellular network acts as an agent that uses deep reinforcement learning to choose between accessing the unlicensed channel and the optimal licensed channel, and to select its transmit power. D2D pairs that use the unlicensed channels under the Listen Before Talk (LBT) mechanism feed information back to the cellular base station, which lets the base station obtain WiFi network throughput information without cooperating with the WiFi network, so the algorithm can run in a heterogeneous environment while guaranteeing the QoS of WiFi users. Compared with Multi-Agent Deep Q Network (MADQN), Multi-Agent Q Learning (MAQL), and a random baseline, the proposed algorithm achieves the highest throughput while guaranteeing the QoS of both WiFi users and cellular users.
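
The abstract describes each D2D transmitter as an agent that jointly picks a channel (the unlicensed channel or one of the licensed channels) and a transmit power. The record does not give the action encoding or the exploration rule, so the following is only a minimal sketch under assumptions: the names `UNLICENSED`, `build_action_space`, and `epsilon_greedy`, the channel count, and the power levels are all hypothetical, and epsilon-greedy is a common exploration choice swapped in for illustration, not necessarily the one the paper uses.

```python
import random
from itertools import product

UNLICENSED = 0  # hypothetical label for the shared unlicensed (WiFi) channel


def build_action_space(num_licensed_channels, power_levels_dbm):
    """Enumerate every (channel, power) pair a single agent can choose.

    Channel 0 stands for the unlicensed channel; 1..K are licensed channels.
    """
    channels = range(num_licensed_channels + 1)
    return list(product(channels, power_levels_dbm))


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


# Assumed example sizes: 3 licensed channels, 4 discrete power levels.
actions = build_action_space(num_licensed_channels=3,
                             power_levels_dbm=[5, 10, 15, 20])
q_values = [0.0] * len(actions)
q_values[6] = 1.0  # pretend the agent's learned estimate favors action 6
channel, power = actions[epsilon_greedy(q_values, epsilon=0.0)]
```

Flattening the channel and power choices into one discrete index lets a single value estimate per action cover both decisions; in a deep variant the list `q_values` would be replaced by the output of each agent's network.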

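Keywords: D2D communication; Listen Before Talk (LBT); Long Term Evolution in unlicensed bands (LTE-U); resource allocation; multi-agent reinforcement learning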

Classification code: TN929.5 [Electronics and Telecommunications: Communication and Information Systems]

 
