Authors: WANG Longxiang[1], DONG Kai, LI Xiaoxuan, DONG Xiaoshe[1], ZHANG Xingjun[1], ZHU Zhengdong[1], WANG Yufei, ZHANG Liping
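Affiliations: [1] School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China; [2] Information Center, Xi'an Academy of Fine Arts, Xi'an 710065, China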
Source: Journal of Xi'an Jiaotong University (《西安交通大学学报》), 2021, No. 5, pp. 83-91 (9 pages)
Funding: National Key Research and Development Program of China (2018YFB0203902).
Abstract: To optimize the network transmission performance of virtual data spaces, an intelligent TCP congestion control algorithm based on proximal policy optimization, TCP-PPO2, is proposed. The TCP congestion control process is abstracted as a partially observable Markov decision process in which an agent is constructed to interact with the network environment. The agent adjusts the congestion window size by observing features of the network state, the network environment feeds a reward value back to the agent, and the agent tries to maximize the expected reward obtained within an episode. A state space comprising throughput, network delay and other network features is designed so that the agent observes enough information to make decisions while performance overhead stays low. The reward function is designed with a weighting scheme so that the agent balances throughput against delay. The parameters of the agent model are updated with the proximal policy optimization algorithm: excessively large parameter updates are truncated so that each update stays within a bounded range, which reduces oscillation during gradient descent and lets training converge quickly. The algorithm is implemented on the NS3 simulator and compared with mainstream congestion control algorithms such as Cubic, HighSpeed and NewReno. The results show that the throughput of TCP-PPO2 reaches 2 to 3 times or more that of the compared algorithms, while for 80% of the sampling points the delay is only 4% above the minimum link delay.
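The abstract gives neither the reward formula nor the exact update rule, so the following is only a minimal sketch of the two ideas it names: a weighted reward that trades throughput against delay, and the standard PPO clipped surrogate objective that bounds each policy update. The function names, the normalization by `max_throughput` and `min_delay`, and the values `alpha=0.5` and `epsilon=0.2` are illustrative assumptions, not the authors' definitions.

```python
from typing import Sequence


def weighted_reward(throughput: float, delay: float,
                    max_throughput: float, min_delay: float,
                    alpha: float = 0.5) -> float:
    """Hypothetical weighted reward (an assumption, not the paper's formula).

    Higher throughput and lower delay both raise the reward;
    alpha sets the balance between the two terms.
    """
    throughput_term = throughput / max_throughput  # 1.0 at full link utilization
    delay_term = min_delay / delay                 # 1.0 when delay equals the link minimum
    return alpha * throughput_term + (1.0 - alpha) * delay_term


def clipped_surrogate(ratios: Sequence[float],
                      advantages: Sequence[float],
                      epsilon: float = 0.2) -> float:
    """Mean of the standard PPO clipped surrogate objective (to be maximized).

    Each ratio is pi_new(a|s) / pi_old(a|s). Clipping it to
    [1 - epsilon, 1 + epsilon] removes the incentive for overly large updates.
    """
    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        clipped_ratio = min(max(ratio, 1.0 - epsilon), 1.0 + epsilon)
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

With a positive advantage, a probability ratio of 1.5 contributes as if it were 1.2, so pushing the policy further in that direction earns nothing extra. This is the truncation of large updates that the abstract credits for stable, fast convergence.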
Classification number: TP319 [Automation and Computer Technology: Computer Software and Theory]