Study on Intelligent Recommendation Method of Dueling Network Reinforcement Learning Based on Regret Exploration　(Cited by: 1)


Authors: HONG Zhi-li; LAI Jun; CAO Lei; CHEN Xi-liang; XU Zhi-xiong (Command & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China)

Affiliation: [1] Command & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China

Source: Computer Science, 2022, Issue 6, pp. 149-157 (9 pages)

Abstract: In recent years, the application of deep reinforcement learning to recommender systems has attracted increasing attention. Building on existing research, this paper proposes a new recommendation model, RP-Dueling, which extends the deep reinforcement learning algorithm Dueling-DQN with a regret-based exploration mechanism, allowing the algorithm to adaptively and dynamically adjust the exploration-exploitation ratio according to the degree of training. The algorithm captures users' dynamic interests and fully explores the action space in recommender systems with large-scale state spaces. Tested on multiple datasets, the proposed algorithm achieves best average results of 0.16 on MAE and 0.43 on RMSE, which are 0.48 and 0.56 lower, respectively, than the current best published results. The experimental results show that the proposed model outperforms both existing traditional recommendation models and existing recommendation models based on deep reinforcement learning.
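The abstract names two ingredients: the Dueling-DQN value decomposition Q(s, a) = V(s) + A(s, a) − mean over a' of A(s, a'), and a regret-driven mechanism that shifts the explore/exploit ratio as training progresses. The paper's exact regret formula is not reproduced on this page, so `regret_epsilon` below is a hypothetical stand-in (a linear decay of the exploration fraction with training progress); only the dueling aggregation follows the standard published form.

```python
import random

def dueling_q(value, advantages):
    """Standard Dueling-DQN aggregation: Q(s,a) = V(s) + A(s,a) - mean_a' A(s,a').
    Subtracting the mean advantage keeps the decomposition identifiable."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

def regret_epsilon(step, total_steps, eps_max=1.0, eps_min=0.05):
    """Hypothetical schedule (not the paper's formula): the exploration
    fraction shrinks linearly with training progress, so early training
    explores broadly and late training mostly exploits."""
    frac = min(step / total_steps, 1.0)
    return eps_max - (eps_max - eps_min) * frac

def select_action(q_values, step, total_steps, rng=random):
    """Epsilon-greedy selection driven by the adaptive schedule above."""
    if rng.random() < regret_epsilon(step, total_steps):
        return rng.randrange(len(q_values))                         # explore
    return max(range(len(q_values)), key=q_values.__getitem__)      # exploit
```

Any decreasing schedule could be substituted for `regret_epsilon`; the point of the sketch is only that the explore/exploit split is a function of training progress rather than a fixed constant.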

Keywords: recommender system; deep reinforcement learning; Dueling-DQN; RP-Dueling; dynamic interest; regret exploration

Classification code: TP181 [Automation and Computer Technology / Control Theory and Control Engineering]

 
