An improved DDPG algorithm for UAV-assisted WSN

Cited by: 2

Authors: SUN Aijing (孙爱晶); WEI De (魏德); SUN Chi (孙驰) (School of Communications and Information Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710121, China; Shaanxi Key Laboratory of Information Communication Network and Security, Xi’an 710121, China)

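Affiliations: [1] School of Communications and Information Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710121, Shaanxi, China; [2] Shaanxi Key Laboratory of Information Communication Network and Security, Xi’an 710121, Shaanxi, China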

Source: Journal of Xi’an University of Posts and Telecommunications, 2024, No. 3, pp. 1-11 (11 pages)

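Funding: National Natural Science Foundation of China (62271391); Special Scientific Research Project for Serving Local Needs, Education Department of Shaanxi Province (21JC032).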

Abstract: To reduce the age of information (AoI) of data collection in unmanned aerial vehicle assisted wireless sensor networks (UAV-WSN), an improved deep deterministic policy gradient (DDPG) algorithm is proposed. A Markov decision process (MDP) model that minimizes the AoI is constructed, and an experience replay matrix and a two-layer network structure are used to speed up the convergence of the algorithm. A Boltzmann strategy is introduced into the search strategy to address the problem of the UAV-WSN system falling into a local optimum when selecting the optimal action. A multi-layer long short-term memory (LSTM) neural network model is adopted to control how much of the information in the experience pool is remembered or forgotten, which prevents episodes from influencing one another during training. The proposed algorithm is compared with the actor-critic (AC) algorithm, the deep Q-network (DQN) algorithm, the DDPG algorithm, and a random algorithm. The results show that the improved DDPG algorithm has better convergence and stability and can minimize the AoI.
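The abstract names a Boltzmann strategy for action search but gives no details, so the following is only a minimal sketch of generic Boltzmann (softmax) exploration over a small set of candidate actions. The function names, the `temperature` parameter, and the example scores are all assumptions made for illustration; they are not taken from the paper, and the authors' integration with DDPG may differ.

```python
import math
import random


def boltzmann_probabilities(scores, temperature=1.0):
    """Turn action scores into selection probabilities with a softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum score first for numerical stability.
    top = max(scores)
    weights = [math.exp((s - top) / temperature) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_select(scores, temperature=1.0, rng=random):
    """Sample an action index according to the Boltzmann probabilities."""
    probs = boltzmann_probabilities(scores, temperature)
    return rng.choices(range(len(scores)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical value estimates for three candidate UAV actions.
    candidate_scores = [1.0, 2.0, 3.0]
    print(boltzmann_probabilities(candidate_scores, temperature=1.0))
    print(boltzmann_select(candidate_scores, temperature=0.5))
```

A high temperature spreads probability almost evenly across actions (more exploration), while a low temperature concentrates it on the highest-scoring action. Lower-scoring actions keep a nonzero chance of being tried, which is the usual motivation for using this rule to escape local optima.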

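Keywords: unmanned aerial vehicle; wireless sensor network; deep deterministic policy gradient; age of information; Boltzmann strategy; long short-term memory neural network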

Classification number: TN929 [Electronics and Telecommunications: Communication and Information Systems]
