Cross-lingual sentiment analysis based on cross-sampling and structural sentiment fusion


Authors: 陈强[1], 何炎祥[1,2], 刘续乐, 刘健博[1]

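Affiliations: [1] School of Computer Science, Wuhan University, Wuhan 430072, Hubei, China; [2] State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072, Hubei, China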

Source: Engineering Journal of Wuhan University (《武汉大学学报(工学版)》), 2017, No. 2, pp. 311-320 (10 pages)

Funding: National Natural Science Foundation of China (Grant Nos. 61472290 and 61472291)

Abstract: Building on co-training, we propose a mutual-learning framework for cross-lingual sentiment analysis (CLSA) that combines a cross-sampling strategy with structural sentiment information. First, a heuristic recognition method extracts sentiment expressions from the text as structural sentiment features, which are fused into the conventional n-gram feature space to form a feature space that represents sentiment more strongly. Second, within the conventional co-training framework, a cross-sampling strategy transfers sentiment knowledge mutually between the unlabeled data of the two language views, so that the source and target languages are learned jointly and efficiently. Finally, the framework yields a higher-performing sentiment classifier for the target language. Experimental results show that, compared with existing CLSA methods, the semi-supervised framework based on cross-sampling and structural sentiment fusion efficiently uses a small amount of labeled source-language data to mine sentiment knowledge from a large amount of unlabeled data, helping the target language learn a better sentiment classifier.
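The two ideas named in the abstract, fusing sentiment-expression features into an n-gram space and exchanging pseudo-labels between two language views during co-training, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' method: the abstract does not give the heuristic rules or the exact cross-sampling procedure, so the toy lexicons in `LEXICON`, the negation rule, the small Naive Bayes classifier, and the rule "each view's most confident predictions become training data for the other view" are hypothetical stand-ins. The sketch also assumes every document is available in both languages (for example through machine translation).

```python
import math
from collections import Counter

# Toy lexicons (hypothetical); the paper's heuristic rules are not in the abstract.
LEXICON = {
    "en": {"sentiment": {"good", "bad"}, "negation": {"not"}},
    "zh": {"sentiment": {"好", "差"}, "negation": {"不"}},
}

def fused_features(tokens, lang, n=2):
    """Fuse n-gram counts ("ng:") with sentiment-expression counts ("se:")."""
    feats = Counter()
    for k in range(1, n + 1):
        for i in range(len(tokens) - k + 1):
            feats["ng:" + " ".join(tokens[i:i + k])] += 1
    lex = LEXICON[lang]
    for i, tok in enumerate(tokens):
        if tok in lex["sentiment"]:
            negated = i > 0 and tokens[i - 1] in lex["negation"]
            feats["se:" + ("NOT_" if negated else "") + tok] += 1
    return feats

class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing; labels are 0 or 1."""
    def fit(self, docs, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.prior = Counter(labels)
        for feats, y in zip(docs, labels):
            self.counts[y].update(feats)
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def prob_pos(self, feats):
        logp = {}
        for y in (0, 1):
            total = sum(self.counts[y].values()) + len(self.vocab) + 1
            logp[y] = math.log((self.prior[y] + 1) / (sum(self.prior.values()) + 2))
            for f, c in feats.items():
                logp[y] += c * math.log((self.counts[y][f] + 1) / total)
        m = max(logp.values())
        e0, e1 = math.exp(logp[0] - m), math.exp(logp[1] - m)
        return e1 / (e0 + e1)

def fit_views(train):
    """Train one classifier per language view."""
    return {v: NaiveBayes().fit(*zip(*train[v])) for v in train}

def co_train_cross_sampling(labeled, unlabeled, rounds=3, k=1):
    """labeled: (src_tokens, tgt_tokens, label); unlabeled: (src_tokens, tgt_tokens)."""
    train = {"en": [(fused_features(s, "en"), y) for s, _, y in labeled],
             "zh": [(fused_features(t, "zh"), y) for _, t, y in labeled]}
    pool = [{"en": fused_features(s, "en"), "zh": fused_features(t, "zh")}
            for s, t in unlabeled]
    for _ in range(rounds):
        if not pool:
            break
        clf = fit_views(train)
        picked = set()
        for view, other in (("en", "zh"), ("zh", "en")):
            probs = [clf[view].prob_pos(doc[view]) for doc in pool]
            ranked = sorted(range(len(pool)),
                            key=lambda i: abs(probs[i] - 0.5), reverse=True)
            for i in ranked[:k]:
                # Assumed cross-sampling rule: `view` labels, `other` learns.
                train[other].append((pool[i][other], int(probs[i] >= 0.5)))
                picked.add(i)
        pool = [doc for i, doc in enumerate(pool) if i not in picked]
    return fit_views(train)["zh"]  # final target-language classifier

labeled = [(["good", "movie"], ["好", "电影"], 1),
           (["bad", "movie"], ["差", "电影"], 0)]
unlabeled = [(["good", "food"], ["好", "食物"]),
             (["bad", "food"], ["差", "食物"])]
target_clf = co_train_cross_sampling(labeled, unlabeled)
```

In this toy setup the labeled data exist only for the source side and its translations, and the target-language classifier returned at the end has also been trained on pool documents pseudo-labeled by the source view. A real system would replace the toy lexicons and classifier with proper resources and would tune `rounds` and `k`.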

Keywords: cross-lingual sentiment analysis; cross-sampling; structural sentiment information

CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
