Authors: 王泽辰 (WANG Zechen); 王树鹏 (WANG Shupeng)[1]; 孙立远 (SUN Liyuan); 张磊 (ZHANG Lei)[1]; 王勇 (WANG Yong)[1]; 郝冰川 (HAO Bingchuan) (Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100193, China; National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100085, China)
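Affiliations: [1] Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100193; [2] National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100085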
Source: Journal of Beijing University of Aeronautics and Astronautics (《北京航空航天大学学报》), 2022, No. 2, pp. 301-310 (10 pages)
Funding: National Natural Science Foundation of China (61931019).
Abstract: Weibo data contains a large amount of information reflecting users' likes and dislikes, which is especially important for applications that analyze the tendency of posts, such as popular trend judgment, precision marketing, and public opinion monitoring. Existing methods, however, tend to focus on simple sentiment classification of posts and cannot analyze the tendency of posts toward specific types of entities. To address Weibo tendency analysis and determine the stance of a post, a semi-supervised learning method is adopted: entity recognition models are trained through co-training and active learning, and deep learning is combined with sentiment rules. Sentiment rules based on principal component analysis are constructed to extract the main components of each sentence and normalize colloquial text into a specified format. The positive or negative nature of the targeted entity, the commendatory or derogatory sense of the sentiment words, and the sentence component that each sentiment word serves as are then used to carry out a deeper level of sentiment classification: stance detection. Stance detection experiments were conducted on a practical problem. Self-comparison and cross-method comparison experiments on datasets of different sizes show that, as the number of posts with labeled entities increases, the accuracy of the model's stance judgments continues to rise, and that the accuracy of the proposed method is significantly higher than that of the comparison methods, improving on existing methods by 2.79% and 10.00%, respectively.
Keywords: sentiment analysis; stance detection; semi-supervised learning; tendentiousness; sentiment rules; co-training; active learning
Classification number: P391 [Astronomy and Earth Sciences: Geophysics]