Application of Multi-label Classification Based on Digital Content Preference


Authors: LIU Bin[1], LI Xiao (School of Electronic Information and Artificial Intelligence, Shaanxi University of Science & Technology, Xi'an 710021, China)

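Affiliation: [1] School of Electronic Information and Artificial Intelligence, Shaanxi University of Science & Technology, Xi'an, Shaanxi 710021, China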

Source: Computer and Modernization (《计算机与现代化》), 2021, No. 2, pp. 45-50 (6 pages)

Funding: National Natural Science Foundation of China (61871260).

Abstract: Current research on digital content in the telecom industry derives user-preference insights mainly from business-defined criteria and relies largely on business experience, which hinders growth of the digital content user base. This paper therefore uses the historical data of high-data-traffic customers and applies multi-label classification algorithms to study digital content preference, identifying potential target customers for each category and ultimately recommending preferred content to customers through marketing, which improves precision-marketing capability. First, anonymized data such as the basic and consumption attributes of users of telecom company M are taken as the data source. Lists of users active in video, music, and reading over the last three months are obtained, and the activity dimensions are labeled manually to produce the initial dataset. Because the positive and negative samples are imbalanced, repeated random down-sampling is used to draw three sample datasets, and six algorithms, including CC, ML-KNN, and RakelD, are compared experimentally. The results show that the RakelD and ML-KNN multi-label classification algorithms have good predictive ability for insight into digital content user preferences. ML-KNN is therefore adopted as the base classifier of RakelD, giving the RakelD_MLKNN method, which is applied to datasets with different positive-to-negative sample ratios. In every case it outperforms the six existing common multi-label classification algorithms and the traditional experience-based selection method.
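The two data-preparation ideas named in the abstract, repeated random down-sampling and the RakelD label transformation, can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `users` records, the function names, the negative-to-positive ratios (1, 2, 3), and the labelset size `k=2` are all assumptions made for illustration. No classifier is trained here; in the paper's RakelD_MLKNN method, ML-KNN would play the role of the base classifier on each label subset.

```python
import random

# The three activity labels named in the abstract.
LABELS = ["video", "music", "reading"]


def downsample(rows, ratio, seed):
    """Keep every positive row and a random subset of the negatives.

    A row is positive if any label is 1. `ratio` is the number of
    negatives kept per positive (an assumed parameter).
    """
    rng = random.Random(seed)
    pos = [r for r in rows if any(r[name] for name in LABELS)]
    neg = [r for r in rows if not any(r[name] for name in LABELS)]
    n_neg = min(len(neg), round(ratio * len(pos)))
    sample = pos + rng.sample(neg, n_neg)
    rng.shuffle(sample)
    return sample


def partition_labels(labels, k, seed):
    """RAkELd-style step: shuffle labels, split into disjoint subsets of size <= k."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return [shuffled[i:i + k] for i in range(0, len(shuffled), k)]


def powerset_class(row, subset):
    """Label powerset: encode a row's labels on one subset as a single class."""
    return tuple(row[name] for name in subset)


# Hypothetical toy data: 10 active (positive) users out of 100.
users = [
    {"id": i, "video": int(i < 10), "music": int(i < 5), "reading": int(i < 2)}
    for i in range(100)
]

# Three down-sampled datasets with assumed negative:positive ratios 1, 2, 3.
samples = [downsample(users, ratio, seed) for seed, ratio in enumerate((1, 2, 3))]

# Disjoint label subsets, then one multi-class target list per subset.
subsets = partition_labels(LABELS, k=2, seed=0)
targets = [[powerset_class(u, s) for u in samples[0]] for s in subsets]

print([len(s) for s in samples])
print(subsets)
```

Each entry of `targets` is an ordinary multi-class problem over one label subset, so any single-output classifier could be fitted per subset and the per-subset predictions merged back into a full label vector.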

Keywords: digital content preference; multi-label classification; CC algorithm; ML-KNN algorithm; RakelD algorithm

Classification No.: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
