A User Profile Generation Method Based on Multi-Aspect Converged Network

Cited by: 3


Authors: MIAO Yu; JIN Xing-nan; DU Yong-ping (Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China)

Affiliation: [1] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China

Source: Computer Technology and Development, 2022, No. 8, pp. 20-25 (6 pages)

Funding: Beijing Natural Science Foundation (4212013).

Abstract: In the era of big data, user profiles are increasingly important for enterprises seeking to understand and acquire target users. However, statistics-based profiling methods cannot handle unstructured text data, and traditional model-based methods cannot extract user features deeply from multiple perspectives. To predict user attributes more comprehensively and accurately, this paper proposes a fusion-network user profile generation method based on multi-level feature extraction. Keywords are extracted from each user's original text and ranked; clause representations are then generated from the top 2 keywords, and word vectors from the top N keywords. Combining a recurrent neural network with an attention mechanism, a classification model with multi-level user feature extraction is constructed, which predicts user attributes from the original user data. Experiments on the Sogou user search text dataset show that the proposed model significantly outperforms the other baseline models, reaching a classification accuracy of 0.73. Ablation experiments further show that each module plays an important role in extracting user features effectively and thereby improving classification accuracy.
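The keyword extraction and ranking step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a scoring function, so TF-IDF is assumed here, and the function names `rank_keywords` and `build_inputs`, the whitespace tokenization, and the toy English queries are all hypothetical. Real Sogou search logs are Chinese and would first need a word segmenter. The sketch covers only the preparation of the two inputs (the top N keywords and the clauses containing a top 2 keyword); the word-vector lookup, the recurrent network, and the attention layer are left out.

```python
import math
from collections import Counter


def rank_keywords(user_queries, corpus):
    """Rank one user's terms by TF-IDF (assumed scoring, not taken from the paper)."""
    # Term frequency over all of this user's queries.
    tf = Counter(tok for q in user_queries for tok in q.split())
    # Document frequency: number of users whose queries contain the term.
    df = Counter()
    for queries in corpus:
        df.update({tok for q in queries for tok in q.split()})
    n_users = len(corpus)
    scores = {t: c * math.log((1 + n_users) / (1 + df[t])) for t, c in tf.items()}
    # Highest score first; ties broken alphabetically so the order is deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))


def build_inputs(user_queries, corpus, top_n=5):
    """Return (top-N keywords, queries containing at least one top-2 keyword)."""
    ranked = rank_keywords(user_queries, corpus)
    top2 = set(ranked[:2])
    clauses = [q for q in user_queries if top2 & set(q.split())]
    return ranked[:top_n], clauses


# Toy corpus (hypothetical): one list of search queries per user.
corpus = [
    ["nba playoffs schedule", "nba finals tickets", "weather today"],
    ["lipstick shades review", "weather today", "lipstick brands"],
    ["weather today", "stock market news"],
]
keywords, clauses = build_inputs(corpus[0], corpus, top_n=3)
print(keywords)
print(clauses)
```

In this toy corpus every user searches for "weather today", so those terms score zero and the corresponding query is excluded from the clause input, while the user-specific term "nba" ranks first. In a full model, the keyword list would be mapped to word vectors and the selected clauses encoded by the recurrent network before attention-based fusion.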

Keywords: user profile; multi-level feature extraction; keyword extraction; recurrent neural network; attention mechanism

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
