A kernel contextual bandit recommendation algorithm  (Cited by: 3)


Authors: WANG Ding; MEN Changqian [1]; WANG Wenjian [1,2]

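Affiliations: [1] College of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China; [2] Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, Shanxi, China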

Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), 2022, No. 3, pp. 625-633 (9 pages)

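Funding: National Natural Science Foundation of China (62076154, U1805263); Central Government Guided Local Science and Technology Development Fund (YDZX20201400001224); Natural Science Foundation of Shanxi Province (201901D111030); Key R&D Program of International Science and Technology Cooperation of Shanxi Province (201903D421050).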

Abstract: Personalized recommendation services are increasingly important in the Internet era, but conventional recommendation algorithms adapt poorly to highly dynamic scenarios. Applying the linear contextual bandit algorithm LinUCB (linear upper confidence bound) to personalized recommendation can effectively mitigate these limitations, yet its accuracy remains unsatisfactory. To address this, the paper proposes an improved algorithm, K-UCB (kernel upper confidence bound). K-UCB drops the unreasonable linearity assumption of LinUCB and uses the kernel method to fit the nonlinear relationship between the expected reward and the context. This yields a new way to compute the upper confidence bound of the estimated reward on nonlinear data, which is used to resolve the exploration-exploitation dilemma in recommendation. Experiments show that K-UCB achieves a higher click-through rate (CTR) than other multi-armed-bandit-based recommendation algorithms and adapts better to personalized recommendation in changing scenarios.
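The idea summarized in the abstract, scoring each arm by a kernel-based estimate of the reward plus a confidence width, can be illustrated with a minimal sketch. This is not the paper's K-UCB formula, which this record does not give. It is a generic kernel ridge regression UCB, and the RBF kernel, the function names (`rbf_kernel`, `solve`, `kernel_ucb_score`, `select_arm`), and the parameters `alpha`, `lam`, and `gamma` are all assumptions made for illustration.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two context vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z

def kernel_ucb_score(x, contexts, rewards, alpha=1.0, lam=1.0, gamma=1.0):
    """Kernel ridge mean plus alpha times the predictive standard deviation.

    contexts/rewards are the past observations of ONE arm (hypothetical layout).
    """
    k_xx = rbf_kernel(x, x, gamma)
    if not contexts:
        # No data yet: mean 0 and the widest possible confidence interval.
        return alpha * math.sqrt(k_xx / lam)
    n = len(contexts)
    # Regularized Gram matrix K + lam * I.
    K = [[rbf_kernel(contexts[i], contexts[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_x = [rbf_kernel(x, c, gamma) for c in contexts]
    weights = solve(K, rewards)            # (K + lam I)^{-1} y
    mean = sum(k * w for k, w in zip(k_x, weights))
    v = solve(K, k_x)                      # (K + lam I)^{-1} k_x
    var = (k_xx - sum(k * vi for k, vi in zip(k_x, v))) / lam
    return mean + alpha * math.sqrt(max(var, 0.0))

def select_arm(x, history, **params):
    """Pick the arm with the highest upper confidence bound for context x."""
    scores = {arm: kernel_ucb_score(x, ctxs, rews, **params)
              for arm, (ctxs, rews) in history.items()}
    return max(scores, key=scores.get)

# Usage: arm "a" was clicked once in this context, arm "b" was not,
# and arm "c" has never been shown.
history = {"a": ([[0.0]], [1.0]), "b": ([[0.0]], [0.0]), "c": ([], [])}
chosen = select_arm([0.0], history)
```

In this sketch an arm with no history still receives a positive score through its confidence width, which is how a UCB rule keeps exploring, while an arm with observed clicks is favored through its estimated mean. Replacing `rbf_kernel` with a plain dot product would give a linear model, the assumption that the abstract says K-UCB removes.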

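Keywords: personalized recommendation; changing scenarios; multi-armed bandit; linear contextual bandit; kernel method; click-through rate; nonlinearity; exploration-exploitation dilemma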

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
