Multi-service network selection game strategy based on Q-learning


Authors: WANG Junxuan [1]; ZHAO Xian; WANG Ying [2]

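Affiliations: [1] Shaanxi Key Laboratory of Information and Communication Network and Security, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China; [2] Information Security Center, Beijing University of Posts and Telecommunications, Beijing 100876, China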

Source: Journal of Xi'an University of Posts and Telecommunications, 2023, No. 4, pp. 1-8 (8 pages)

Funding: Key Research and Development Program of Shaanxi Province (2020ZDLGY02-06).

Abstract: To increase network throughput and improve user experience, a multi-service network selection game strategy based on Q-learning (QSNG) is proposed. The strategy derives a multi-service network utility function through fuzzy reasoning and comprehensive attribute evaluation and uses it as the Q-learning reward. Each user predicts the payoff of its network selection strategy with a game algorithm, which lets it avoid accessing heavily loaded networks. A binary exponential back-off algorithm further reduces the probability that multiple users access the same network concurrently. Simulation results show that the strategy adaptively switches to the most suitable network according to the user's QoS requirements and price preference. Compared with the reinforcement learning with network-assisted feedback (RLNF) strategy and the radio network selection games (RSG) strategy, it reduces the total number of network switches by 80% and 60%, respectively, increases network throughput by 7% and 8%, respectively, and preserves system fairness.
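
The abstract names two standard building blocks, Q-learning with a utility-based reward and binary exponential back-off, without giving their formulas. The Python sketch below illustrates only the textbook versions of those two pieces. It is not the authors' implementation: the state and action names, the learning parameters `alpha` and `gamma`, the cap `max_exponent`, and especially `toy_utility` (a hypothetical stand-in for the paper's fuzzy-reasoning utility function) are all assumptions made for illustration.

```python
import random


def toy_utility(load, price, price_weight=0.3):
    """Hypothetical reward: favors lightly loaded, cheap networks.

    `load` and `price` are assumed to be normalized to [0, 1]. This is NOT
    the utility function from the paper, which the abstract does not specify.
    """
    return (1.0 - load) * (1.0 - price_weight) + (1.0 - price) * price_weight


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Textbook tabular Q-learning update; returns the new Q(state, action)."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def backoff_window(collisions, max_exponent=10):
    """Contention window size after `collisions` failed access attempts."""
    return 2 ** min(collisions, max_exponent)


def backoff_slots(collisions, rng=random, max_exponent=10):
    """Draw a random wait, in slots, from [0, window - 1]."""
    return rng.randrange(backoff_window(collisions, max_exponent))


if __name__ == "__main__":
    # One state with two candidate networks (names are illustrative only).
    q = {"s": {"wlan": 0.0, "lte": 0.0}}
    reward = toy_utility(load=0.5, price=0.2)
    print(q_update(q, "s", "wlan", reward, "s"))
    print(backoff_window(3), backoff_slots(3))
```

In this sketch a user that collides repeatedly on the same network doubles its contention window each time, which spreads concurrent access attempts over a longer interval, while the Q-table gradually shifts preference toward networks that return higher utility.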

Keywords: multi-service network selection; comprehensive attribute evaluation; binary exponential back-off algorithm; Q-learning

Classification number: TN929 [Electronics and Telecommunications: Communication and Information Systems]

 
