Authors: LI Zhixin [1,2]; LAN Danmei; ZHANG Canlong; TANG Suqin [1,2]
Affiliations: [1] Guangxi Key Lab of Multi-source Information Mining and Security, Guangxi Normal University, Guilin, Guangxi 541004, China; [2] Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing, Guilin, Guangxi 541004, China
Source: Computer Engineering (《计算机工程》), 2018, No. 7, pp. 212-218 (7 pages)
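Funding: National Natural Science Foundation of China (61663004; 61363035; 61365009); Guangxi Natural Science Foundation (2016GXNSFAA380146; 2017GXNSFAA198365); Director's Fund of the Guangxi Key Lab of Multi-source Information Mining and Security (16-A-03-02); Guangxi Degree and Postgraduate Education Reform Project (JGY2015031)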
Abstract: The large volume of spam comments on microblogs can harm individuals, society, and even the nation. To identify spam comments on microblogs, a recognition method based on co-training is proposed. A rule-based recognition method is first defined to filter out explicit spam comments, and the remaining comments are treated as relevant comments. An AdaBoost classifier and a Support Vector Machine (SVM) classifier are then constructed and trained collaboratively with the Co-Training algorithm to decide whether each remaining comment is spam, which improves classification accuracy and reduces sample-labeling effort. Experimental results show that the method achieves better recognition performance than a spam comment recognition method based on similarity calculation and one based on multiple comment features.
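Keywords: microblog spam comments; co-training; Tongyici Cilin (同义词词林, a Chinese synonym thesaurus); support vector machine; similarity calculation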
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
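
The two-stage pipeline summarized in the abstract (a rule filter for explicit spam, then co-training of two classifiers on the remaining comments) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the marker list in `SPAM_MARKERS`, the toy numeric features, the two feature views, and the function names are hypothetical, and the two learners are simple nearest-centroid stand-ins rather than the AdaBoost and SVM classifiers the paper uses. The record does not give the paper's rules or features.

```python
# Sketch only: rule filter + co-training loop with stand-in learners.
# The paper uses AdaBoost and SVM; nearest-centroid is swapped in here
# so the example stays dependency-free.
from math import dist

SPAM_MARKERS = ("http://", "https://", "加微信")  # hypothetical rule list


def is_explicit_spam(comment: str) -> bool:
    """Rule stage: flag comments containing any hypothetical spam marker."""
    return any(marker in comment for marker in SPAM_MARKERS)


class CentroidClassifier:
    """Stand-in learner: predicts the label of the nearest class centroid."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_with_confidence(self, x):
        # Confidence is the distance margin between the two closest centroids.
        ranked = sorted((dist(x, c), label) for label, c in self.centroids.items())
        margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else 0.0
        return ranked[0][1], margin


def co_train(view_a, view_b, labels, pool_a, pool_b, rounds=5, per_round=1):
    """Each round, each learner pseudo-labels its most confident unlabeled
    items, and those items join the shared labeled set for both views."""
    lab_a, lab_b, y = list(view_a), list(view_b), list(labels)
    pool = list(range(len(pool_a)))  # indices of still-unlabeled comments
    clf_a, clf_b = CentroidClassifier(), CentroidClassifier()
    for _ in range(rounds):
        clf_a.fit(lab_a, y)
        clf_b.fit(lab_b, y)
        if not pool:
            break
        chosen = {}
        for clf, view in ((clf_a, pool_a), (clf_b, pool_b)):
            scored = sorted(
                pool,
                key=lambda i: clf.predict_with_confidence(view[i])[1],
                reverse=True,
            )
            for i in scored[:per_round]:
                chosen.setdefault(i, clf.predict_with_confidence(view[i])[0])
        for i, label in chosen.items():
            lab_a.append(pool_a[i])
            lab_b.append(pool_b[i])
            y.append(label)
            pool.remove(i)
    # Refit so the final models include the last round's pseudo-labels.
    return clf_a.fit(lab_a, y), clf_b.fit(lab_b, y)


# Toy data: label 0 = relevant comment, 1 = spam; two views per comment.
clf_a, clf_b = co_train(
    view_a=[[0.0, 0.1], [1.0, 0.9]],
    view_b=[[0.2], [0.8]],
    labels=[0, 1],
    pool_a=[[0.1, 0.0], [0.9, 1.0], [0.2, 0.2], [0.8, 0.7]],
    pool_b=[[0.1], [0.9], [0.3], [0.7]],
)
```

The labeling saving mentioned in the abstract comes from the loop: only two comments are hand-labeled in this toy run, and the four pool comments receive pseudo-labels from whichever learner is most confident about them.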