Identification of Douban Book Spam Comments Based on Book Features and Dictionary

Cited by: 1

Authors: LIU Gao-jun (刘高军); YIN Jia-ming (印佳明) (School of Computer, North China University of Technology, Beijing 100144, China)

Affiliation: [1] School of Computer, North China University of Technology

Source: Computer Technology and Development (《计算机技术与发展》), 2019, No. 11, pp. 107-112 (6 pages)

Funding: National Natural Science Foundation of China (61672040)

Abstract: With the spread and convenience of the Internet, review sites and commercial websites at home and abroad are developing rapidly, and user comments increasingly influence people's lives. Douban is a well-known online community. A growing number of users post comments on movies, books, and music there, and a growing number of people check Douban ratings and comments before deciding whether to watch a movie, read a book, or listen to music. Identifying spam comments is therefore crucial, because spam distorts people's view of the item being reviewed. This paper introduces semantic analysis, a book feature dictionary, and a spam comment dictionary. Semantic analysis serves as an auxiliary aid for detecting spam comments, and a weight-proportional filtering model is used to detect them. Experimental results show that the proposed method achieves 85.4% accuracy and identifies spam comments effectively.
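The abstract does not specify how the two dictionaries and the weight-proportional filter are combined, so the following is only a minimal sketch of one plausible reading: score a comment by the weighted share of its tokens found in a spam dictionary, minus the weighted share found in a book feature dictionary, and flag it when the score exceeds a threshold. The function names, the weights `w_spam` and `w_feature`, the `threshold`, and the tiny English dictionaries are all hypothetical and are not taken from the paper.

```python
from typing import Iterable, List, Set

# Hypothetical toy dictionaries; the paper's actual dictionaries are not given here.
BOOK_FEATURES: Set[str] = {"plot", "author", "chapter", "characters", "writing"}
SPAM_TERMS: Set[str] = {"click", "link", "buy", "discount"}


def dictionary_ratio(tokens: List[str], dictionary: Set[str]) -> float:
    """Return the fraction of tokens that appear in the dictionary."""
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in dictionary)
    return hits / len(tokens)


def spam_score(
    tokens: Iterable[str],
    book_features: Set[str] = BOOK_FEATURES,
    spam_terms: Set[str] = SPAM_TERMS,
    w_spam: float = 1.0,      # assumed weight for spam-dictionary hits
    w_feature: float = 1.0,   # assumed weight for book-feature hits
) -> float:
    """Weighted spam share minus weighted book-feature share."""
    token_list = list(tokens)
    spam_part = w_spam * dictionary_ratio(token_list, spam_terms)
    feature_part = w_feature * dictionary_ratio(token_list, book_features)
    return spam_part - feature_part


def is_spam(tokens: Iterable[str], threshold: float = 0.0, **kwargs) -> bool:
    """Flag a comment as spam when its score exceeds the threshold."""
    return spam_score(tokens, **kwargs) > threshold


if __name__ == "__main__":
    review = "the plot is slow but the writing and characters are great".split()
    advert = "click the link to buy cheap books at a discount".split()
    print(is_spam(review), is_spam(advert))
```

In this sketch the comments are already tokenized by whitespace; real Douban comments are Chinese text and would first need a word segmenter, and the weights and threshold would have to be tuned on labeled data.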

Keywords: Internet; Douban; book reviews; semantic analysis; spam comment detection

Classification: TP301 [Automation and Computer Technology: Computer System Architecture]

 
