Authors: SUN Aijing (孙爱晶) [1,2]; WEI De (魏德) [1,2]; SUN Chi (孙驰) [1,2]
Affiliations: [1] School of Communications and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121, China; [2] Shaanxi Key Laboratory of Information Communication Network and Security, Xi'an, Shaanxi 710121, China
Source: Journal of Xi'an University of Posts and Telecommunications (《西安邮电大学学报》), 2024, No. 3, pp. 1-11 (11 pages)
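Funding: National Natural Science Foundation of China (62271391); Shaanxi Provincial Department of Education Special Scientific Research Project for Serving Local Needs (21JC032).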
Abstract: To reduce the age of information (AoI) of data collection in unmanned aerial vehicle assisted wireless sensor networks (UAV-WSN), an improved deep deterministic policy gradient (DDPG) algorithm is proposed. A Markov decision process (MDP) model for minimizing AoI is constructed, and an experience replay matrix and a two-layer network structure are used to speed up the convergence of the algorithm. A Boltzmann strategy is introduced into the search strategy to keep the UAV-WSN system from settling on a local optimum when selecting the optimal action. A multi-layer long short-term memory (LSTM) neural network model is adopted to control how much of the information in the experience pool is remembered or forgotten, which prevents episodes from interfering with one another during training. The proposed algorithm is compared with the actor-critic (AC) algorithm, the deep Q-network (DQN) algorithm, the DDPG algorithm, and a random algorithm. The results show that the improved DDPG algorithm has better convergence and stability and can minimize the AoI.
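Keywords: unmanned aerial vehicle; wireless sensor network; deep deterministic policy gradient; age of information; Boltzmann strategy; long short-term memory neural network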
Classification code: TN929 [Electronics and Telecommunications: Communication and Information Systems]
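
Note on the Boltzmann strategy named in the abstract: this record does not say how the paper combines it with DDPG, whose actions are continuous. The following is only a minimal sketch of generic Boltzmann (softmax) action selection, assuming a small discrete set of candidate actions that each already have a score. The function names, the `temperature` parameter, and the example scores are hypothetical and are not taken from the paper.

```python
import math
import random


def boltzmann_probabilities(scores, temperature=1.0):
    """Turn action scores into selection probabilities (softmax with temperature).

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum score so that exp() cannot overflow.
    top = max(scores)
    weights = [math.exp((s - top) / temperature) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_select(scores, temperature=1.0, rng=random):
    """Sample one action index according to the Boltzmann probabilities."""
    probs = boltzmann_probabilities(scores, temperature)
    return rng.choices(range(len(scores)), weights=probs, k=1)[0]


if __name__ == "__main__":
    example_scores = [1.0, 2.0, 3.0]  # assumed scores for three candidate actions
    print(boltzmann_probabilities(example_scores, temperature=1.0))
    print(boltzmann_select(example_scores, temperature=1.0))
```

A high temperature makes the choice close to uniform, so lower-scored actions are still tried; a low temperature makes it close to greedy. That trade-off is the usual reason for using this kind of sampling to avoid getting stuck on a locally best action.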