Chinese Sentiment Polarity Analysis Using FastText Improved with the Borderline-SMOTE Algorithm  Cited by: 4

ANALYSIS OF CHINESE EMOTIONAL POLARITY BASED ON FASTTEXT COMBINED WITH BORDERLINE SMOTE ALGORITHM


Authors: Pan Zhengjun (潘正军), Zhao Lianfen (赵莲芬), Yuan Li'na (袁丽娜) [1], Wang Hongqin (王红勤) [1] (South China Institute of Software Engineering, Guangzhou University, Guangzhou 510990, Guangdong, China)

Affiliation: [1] South China Institute of Software Engineering, Guangzhou University, Guangzhou 510990, Guangdong, China

Source: Computer Applications and Software (《计算机应用与软件》), 2021, No. 11, pp. 295-299, 349 (6 pages)

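Funding: Guangdong Province Characteristic Innovation (Natural Science) Key Research Projects (2020KTSCX214, 2018KQNCX393); Intramural Research Projects of the South China Institute of Software Engineering, Guangzhou University (ky201911, JXTD201901).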

Abstract: A single FastText model performs poorly at sentiment polarity analysis on imbalanced Chinese corpora, traditional Jieba word segmentation adapts weakly to Chinese text that spans many domains, and data skew causes the accuracy and recall of Chinese sentiment polarity analysis to fluctuate. To address these problems, the paper proposes a FastText-based Chinese sentiment polarity analysis method improved with the Borderline-SMOTE algorithm. Two preprocessing steps are used: Borderline-SMOTE oversampling to address data skew in classification, and pkuseg Chinese word segmentation to address the wide range of domains involved. The preprocessed data are then combined with FastText to perform Chinese sentiment polarity analysis. Experimental results show that the model's accuracy in Chinese sentiment polarity analysis improves to a certain extent.
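The oversampling step named in the abstract can be illustrated with a minimal pure-Python sketch of Borderline-SMOTE in its commonly described form. It interpolates new minority samples only around minority points whose neighbourhood is mostly, but not entirely, majority class. This record does not give the authors' implementation or say how text was turned into feature vectors, so the function names (`nearest`, `borderline_smote`) and parameters (`m`, `k`, `n_new`, `seed`) are hypothetical and the inputs are assumed to be numeric vectors.

```python
import math
import random


def nearest(point, pool, count):
    """Return the `count` (index, vector) pairs in `pool` closest to `point`."""
    return sorted(pool, key=lambda item: math.dist(point, item[1]))[:count]


def borderline_smote(X, y, minority, m=5, k=5, n_new=1, seed=0):
    """Return synthetic minority vectors generated near the class border.

    X: list of numeric feature vectors; y: list of labels;
    minority: label of the under-represented class.
    All names here are illustrative assumptions, not the paper's code.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    synthetic = []
    for i in minority_idx:
        # m nearest neighbours among all other samples, regardless of class
        others = [(j, X[j]) for j in range(len(X)) if j != i]
        neighbours = nearest(X[i], others, m)
        n_major = sum(1 for j, _vec in neighbours if y[j] != minority)
        # "Danger" samples: at least half, but not all, neighbours are majority.
        # All-majority neighbourhoods are treated as noise and skipped.
        if not (len(neighbours) / 2 <= n_major < len(neighbours)):
            continue
        # Interpolate towards one of the k nearest minority neighbours
        peers = [(j, X[j]) for j in minority_idx if j != i]
        targets = nearest(X[i], peers, k)
        if not targets:
            continue
        for _step in range(n_new):
            _j, q = rng.choice(targets)
            gap = rng.random()  # position along the segment, in [0, 1)
            synthetic.append([a + gap * (b - a) for a, b in zip(X[i], q)])
    return synthetic
```

The returned vectors would be appended to the training set with the minority label before a classifier is trained. In practice a maintained implementation is likely preferable to this sketch.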

Keywords: machine learning; Chinese word segmentation; Borderline-SMOTE; FastText; sentiment polarity analysis

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
