Authors: HAN Qing-qing (韩晴晴), ZHANG Yan-mei (张艳梅)[1], NIU Wa (牛娃)[1] (Information School, Central University of Finance and Economics, Beijing 102206, China)
Affiliation: [1] Information School, Central University of Finance and Economics (中央财经大学信息学院)
Source: Computer Science (《计算机科学》), 2019, No. 11, pp. 202-208 (7 pages)
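Funding: Supported by the National Natural Science Foundation of China (61602536, 61773415) and a Key Project of the Beijing Social Science Foundation (16YJA001)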
Abstract: In the rapidly developing Internet era, Weibo generates a large volume of information, but Weibo topics and similar areas attract many paid posters (the so-called "water army", 水军), who to some extent prevent ordinary users from learning the truth about a person or an event. To identify the water army efficiently and accurately when water-army samples are scarce and non-water-army samples are abundant, the paper adopts a semi-supervised co-training algorithm. By studying and jointly analysing multiple characteristics of Weibo users, the algorithm redefines six attribute features, including account attention, number of microblogs posted per day, and microblog influence. In keeping with the co-training setup, the six features are divided into two attribute sets, each corresponding to one view, and each view trains classifiers with seven classification methods from the Scikit-Learn machine learning library to identify water-army accounts. Experiments are conducted on a crawled dataset of Weibo users. The results show that accuracy, recall, precision, and F1-measure are all relatively high when the two views train their classifiers with the naive Bayes and logistic regression algorithms, respectively. A comprehensive analysis of Weibo user features, combined with a semi-supervised co-training algorithm suited to the actual situation, can therefore identify the Weibo water army accurately, efficiently, and quickly.
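The two-view co-training scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the data are synthetic, the split of features into `X1` and `X2` is arbitrary, and the function names `co_train` and `predict`, the confidence-based selection rule, and the parameters `rounds` and `per_round` are all assumptions made for the example. Only the pairing of a naive Bayes classifier with a logistic regression classifier, one per view, comes from the abstract.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB


def co_train(X1, X2, y, labeled, rounds=10, per_round=4):
    """Co-training over two feature views (hypothetical helper).

    X1, X2  : arrays of shape (n, d1) and (n, d2), one per view
    y       : integer labels (0 = ordinary user, 1 = water army); only the
              entries where `labeled` is True are trusted
    labeled : boolean mask marking the initially labelled users
    """
    y, labeled = y.copy(), labeled.copy()
    clf1, clf2 = GaussianNB(), LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        clf1.fit(X1[labeled], y[labeled])
        clf2.fit(X2[labeled], y[labeled])
        # Each view pseudo-labels the unlabelled users it is most sure
        # about; both classifiers are retrained on them in the next round.
        for clf, X in ((clf1, X1), (clf2, X2)):
            pool = np.flatnonzero(~labeled)
            if pool.size == 0:
                break
            proba = clf.predict_proba(X[pool])
            top = np.argsort(proba.max(axis=1))[-per_round:]
            y[pool[top]] = clf.classes_[proba[top].argmax(axis=1)]
            labeled[pool[top]] = True
    clf1.fit(X1[labeled], y[labeled])
    clf2.fit(X2[labeled], y[labeled])
    return clf1, clf2


def predict(clf1, clf2, X1, X2):
    # Combine the two views by averaging their class probabilities.
    proba = (clf1.predict_proba(X1) + clf2.predict_proba(X2)) / 2
    return clf1.classes_[proba.argmax(axis=1)]


# Synthetic stand-in for the crawled data: 6 features split into 2 views,
# with the water army as the minority class (all values are made up).
rng = np.random.default_rng(0)
n = 400
y_true = (rng.random(n) < 0.2).astype(int)
X1 = rng.normal(loc=y_true[:, None] * 2.0, scale=1.0, size=(n, 3))
X2 = rng.normal(loc=y_true[:, None] * 2.0, scale=1.0, size=(n, 3))

# Only 10 users per class start out labelled; the rest are marked -1.
labeled = np.zeros(n, dtype=bool)
for c in (0, 1):
    idx = rng.choice(np.flatnonzero(y_true == c), size=10, replace=False)
    labeled[idx] = True
y_init = np.where(labeled, y_true, -1)

clf1, clf2 = co_train(X1, X2, y_init, labeled)
pred = predict(clf1, clf2, X1, X2)
accuracy = (pred == y_true).mean()
print(f"accuracy on all users: {accuracy:.3f}")
```

Averaging the two views' probabilities at prediction time is one simple way to combine them; the abstract does not say how the paper merges the views' outputs, so a different rule may be used there.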
Classification code: TP393 [Automation and Computer Technology: Computer Application Technology]