Design and Implementation of a Web Crawling-Based Negative Emotion Mining System  Cited by: 15


Authors: 彭纪奔[1] 吴林[2] 陈贤[3] 黄雷君[4]

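Affiliations: [1] College of Physics and Electronic Information Engineering, Wenzhou University, Wenzhou, Zhejiang 325035 [2] College of Teacher Education, Wenzhou University, Wenzhou, Zhejiang 325035 [3] College of Mechanical and Electrical Engineering, Wenzhou University, Wenzhou, Zhejiang 325035 [4] School of Information Engineering, Zhejiang A&F University, Lin'an, Zhejiang 311300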

Source: Computer Applications and Software, 2016, No. 10, pp. 9-13, 71 (6 pages)

Funding: Zhejiang Province Xinmiao Talent Program (2014R424018)

Abstract: The rise of social networks has given people a new emotional space, but those who show mental health problems online usually do not receive the attention and help they need, and sometimes even become victims of malicious attacks by other users. To make it easier to offer timely and effective psychological counselling and assistance to people in need in cyberspace, we propose CyberCare, a crawling-based system for mining negative emotion on the web. Built on the open-source Scrapy crawling framework, CyberCare periodically and automatically crawls target websites, measures the negative emotion of the crawled pages, and provides an interface through which volunteer psychologists can intervene promptly. Experiments on several well-known Chinese websites show that CyberCare narrows the range of posts psychologists need to review to about one thousandth of a site's new posts, significantly improving their efficiency. For sections devoted specifically to emotional topics, precision and recall reach approximately 60% and 80% respectively, which indicates that the system is effective.

Keywords: web crawler; emotion mining; negative emotion; regular expression

Classification: TP3 [Automation and Computer Technology: Computer Science and Technology]
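
The abstract says that crawled pages are scored for negative emotion, and the keywords point to regular expressions as the mechanism. The following is a minimal sketch of that measurement step only. It is not the authors' implementation: the pattern list, the weights, the threshold, and the names negative_score and flag_posts are all hypothetical, since the record does not give CyberCare's actual expressions or scoring formula.

```python
import re

# Hypothetical weighted patterns; the real CyberCare expressions are not
# given in this record. Heavier weights mark more alarming phrases.
NEGATIVE_PATTERNS = [
    (re.compile(r"绝望|崩溃|撑不下去"), 2.0),  # despair / breakdown / cannot hold on
    (re.compile(r"失眠|孤独|难过"), 1.0),      # insomnia / lonely / sad
]


def negative_score(text: str) -> float:
    """Sum weight * match count over all patterns (assumed scoring rule)."""
    return sum(weight * len(pattern.findall(text))
               for pattern, weight in NEGATIVE_PATTERNS)


def flag_posts(posts, threshold=3.0):
    """Return (score, post) pairs scoring at or above threshold, highest first."""
    scored = [(negative_score(post), post) for post in posts]
    flagged = [pair for pair in scored if pair[0] >= threshold]
    return sorted(flagged, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    sample = ["今天天气很好", "最近总是失眠", "我很绝望,撑不下去了"]
    for score, post in flag_posts(sample):
        print(score, post)
```

In a Scrapy-based crawler, a function like negative_score would plausibly be applied to the post text extracted in a spider callback, with only the flagged posts passed on to the reviewers' interface. Raising the threshold trades recall for a shorter review list.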

 
