Authors: ZHANG Tao (张涛); ZHANG Zhijun (张志军)[1]; CAO Jiawei (曹家伟); FAN Yumin (范钰敏); LIU Jiahui (刘佳慧); YUAN Weihua (袁卫华)[1]
Affiliation: [1] School of Computer Science and Technology, Shandong Jianzhu University, Ji'nan 250100, Shandong, China
Source: Software Guide (《软件导刊》), 2024, No. 12, pp. 27-35 (9 pages)
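Funding: Shandong Provincial Natural Science Foundation (ZR2021MF099, ZR2022MF334); Shandong Provincial Teaching Reform Research Project (M2021130, M2022245, Z2022202); Shandong Provincial High-Quality Professional Degree Teaching Case Library Construction Project (SDYAL2022155); Shandong Provincial Key Research and Development Program (Soft Science Project) (2021RKY03056); "Haiyou Plan" Industry Leading Talent Local Innovation Team Project (2023).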
Abstract: Deep reinforcement learning is by now well established in interactive recommender systems. However, little research is devoted specifically to modeling the state representation, and existing work builds the state only from the positive feedback sequences produced during user interaction. As a result, the recommender overlooks both the latent relationships within negative feedback sequences and the shifts in user interest they reflect, so its recommendations tend to be one-sided. To address this gap, a recommender system framework based on contrastive learning and deep reinforcement learning, named Contrastive Learning and Deep Reinforcement Learning-Based Recommender System (CRLRS), is proposed. Its state representation module models both the positive and the negative feedback sequences generated as the user interacts with the recommender. In addition, a contrastive auxiliary task is incorporated to mitigate the sparsity of positive feedback data and to draw a finer-grained distinction between positive and negative feedback. Extensive experiments were conducted on two real-world datasets, Movielens-100K and Movielens-1M: HR@10 reaches 0.7052 and 0.4902, respectively, and NDCG@10 reaches 0.4782 and 0.2715. The results show that the method clearly outperforms current state-of-the-art methods, which supports the need both to model positive and negative feedback jointly and to include the contrastive auxiliary task.
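Keywords: deep reinforcement learning; contrastive learning; recommender system; positive and negative feedback; state representation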
Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
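Note on the metrics: the abstract reports HR@10 and NDCG@10 but does not state the evaluation protocol. The sketch below shows how these two metrics are commonly computed, assuming one held-out item per user and a ranked recommendation list. That protocol is an assumption, not something the record confirms, and all function and variable names are hypothetical.

```python
import math


def hit_ratio_at_k(ranked_items, target_item, k=10):
    """Return 1.0 if the held-out item appears in the top-k list, else 0.0."""
    return 1.0 if target_item in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, target_item, k=10):
    """NDCG@k for a single relevant item.

    With one relevant item the ideal DCG is 1, so NDCG reduces to
    1 / log2(rank + 1), where rank is the 1-based position of the item.
    """
    top_k = ranked_items[:k]
    if target_item not in top_k:
        return 0.0
    rank = top_k.index(target_item) + 1
    return 1.0 / math.log2(rank + 1)


def evaluate(recommendations, held_out, k=10):
    """Average HR@k and NDCG@k over all users in `held_out`.

    recommendations: dict mapping user -> ranked list of item ids (assumed input)
    held_out:        dict mapping user -> the single held-out item id (assumed input)
    """
    users = list(held_out)
    hr = sum(hit_ratio_at_k(recommendations[u], held_out[u], k) for u in users)
    ndcg = sum(ndcg_at_k(recommendations[u], held_out[u], k) for u in users)
    return hr / len(users), ndcg / len(users)


# Toy usage: u1's held-out item is ranked second, u2's is missing from the list.
recs = {"u1": [5, 3, 9], "u2": [1, 2, 4]}
held = {"u1": 3, "u2": 7}
hr, ndcg = evaluate(recs, held, k=10)
```

Under this reading, HR@10 is the fraction of users whose held-out item reaches the top ten, while NDCG@10 additionally rewards placing that item nearer the top of the list.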