User Activity Prediction Based on an RNN and GBDT Fusion Method

Retention Rate Prediction Based on GBDT and RNN Fusion Method


Authors: SHENG Ai-lin; ZUO Jie [1]; SUN Pin-jie [2]

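Affiliations: [1] College of Computer Science, Sichuan University, Chengdu 610065; [2] Shanghai University of Political Science and Law, Shanghai 200000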

Source: Modern Computer (《现代计算机》), 2020, No. 3, pp. 8-11, 33 (5 pages in total)

Funding: National Key Research and Development Program of China

Abstract: In the WSDM Cup user retention prediction challenge organized by Baidu, the task is to predict, from a user's interaction data during one day in the Haokan Video (好看视频) app, whether that user will continue to use the app the next day. This is a typical binary classification task. After downloading the app and using it for a while, some new users log in and use it again the next day (these are called retained users), while others only explore the app on the day of download and do not return for a long time afterwards. This paper designs a practical machine learning approach to the problem, comprising feature engineering, gradient boosting decision tree (GBDT) models such as LightGBM and CatBoost, an RNN with a many-to-many structure, and model stacking. The aim is to find methods that effectively improve the accuracy of user retention prediction and to identify the key factors that influence retention. In the competition, the proposed solution took second place with an evaluation score of 0.7671.
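The abstract names model stacking but gives no details of how the models were combined. The following is only a minimal sketch of one common form of stacking (out-of-fold predictions fed to a logistic-regression meta-learner), written in plain Python. The synthetic data, the toy "rate stump" base learners (stand-ins for models such as LightGBM, CatBoost, or an RNN), and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math
import random

def fit_rate_stump(feature):
    """Toy base learner (hypothetical): positive rate on each side of the feature mean."""
    def fit(X, y):
        cut = sum(row[feature] for row in X) / len(X)
        hi = [t for row, t in zip(X, y) if row[feature] > cut]
        lo = [t for row, t in zip(X, y) if row[feature] <= cut]
        base = sum(y) / len(y)
        p_hi = sum(hi) / len(hi) if hi else base
        p_lo = sum(lo) / len(lo) if lo else base
        return lambda row: p_hi if row[feature] > cut else p_lo
    return fit

def out_of_fold(X, y, learners, k=5):
    """Score every row with base models that did not see it during fitting."""
    n = len(X)
    oof = [[0.0] * len(learners) for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held = [i for i in range(n) if i % k == fold]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        for j, fit in enumerate(learners):
            predict = fit(X_train, y_train)
            for i in held:
                oof[i][j] = predict(X[i])
    return oof

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def fit_logistic(Z, y, lr=0.5, epochs=500):
    """Meta-learner: logistic regression trained by batch gradient descent."""
    w = [0.0] * len(Z[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for z, t in zip(Z, y):
            err = sigmoid(b + sum(wi * zi for wi, zi in zip(w, z))) - t
            for j, zj in enumerate(z):
                grad_w[j] += err * zj
            grad_b += err
        w = [wi - lr * g / len(Z) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(Z)
    return w, b

# Synthetic stand-in data: two features per user, label 1 means "retained".
random.seed(0)
X = [[random.random(), random.random()] for _ in range(200)]
y = [1 if a + b + random.gauss(0, 0.2) > 1.0 else 0 for a, b in X]

learners = [fit_rate_stump(0), fit_rate_stump(1)]
oof = out_of_fold(X, y, learners, k=5)   # level-1 features for the meta-learner
w, b = fit_logistic(oof, y)
stacked = [sigmoid(b + sum(wi * zi for wi, zi in zip(w, z))) for z in oof]
```

Fitting the meta-learner on out-of-fold predictions, rather than on predictions for rows the base models were trained on, keeps the blend weights from rewarding base models that merely memorize their training data.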

Keywords: feature engineering; GBDT; RNN; SWA

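Classification codes: TP311.56 [Automation and Computer Technology: Computer Software and Theory]; TP181 [Automation and Computer Technology: Computer Science and Technology]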

 
