Authors: Zhang Yang; Hu Yan[1] (School of Computer Science & Technology, Wuhan University of Technology, Wuhan 430070, China)
Affiliation: [1] School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, China
Source: Application Research of Computers, 2021, No. 1, pp. 69-74 (6 pages)
Funding: Supported by the Natural Science Foundation of Hubei Province (2019CFC919).
Abstract: Compared with sentiment classification of single-language short texts, code-switching short texts pose additional challenges: the words that express sentiment may come from either language, and the grammatical structure is more complex, so traditional word embeddings alone cannot give a classifier enough useful features, and classification performance suffers. To address these problems, this paper proposes a dual-channel composite model that fuses character- and word-level features. First, to counter class imbalance in the dataset, it proposes an undersampling algorithm based on BERT semantic similarity. Second, it constructs a dual-channel deep learning network: the raw text, embedded at the character level in one channel and at the word level in the other, is fed into modules composed of a CNN and an LSTM with an attention mechanism for multi-granularity feature extraction. Finally, the features from the two channels are fused for classification. Experiments on the five-class code-switching dataset released for NLPCC 2018 Task 1 show that the model's overall performance improves on current representative deep learning models.
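The abstract does not spell out the undersampling procedure, only that it is based on BERT semantic similarity. One plausible reading is that near-duplicate majority-class sentences are dropped first, judged by cosine similarity of their precomputed BERT sentence embeddings. A minimal greedy sketch under that assumption (the function name, threshold, and top-up step are illustrative, not from the paper):

```python
import numpy as np

def undersample_majority(embeddings, target_size, sim_threshold=0.95):
    """Greedy undersampling of a majority class using cosine similarity
    between precomputed sentence embeddings (e.g. from a BERT encoder).

    A sample is kept only if its similarity to every already-kept sample
    is below the threshold, so near-duplicates are discarded first.
    Returns the sorted indices of the kept samples.
    """
    # Normalize rows so dot products become cosine similarities.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    kept = []
    for i in range(len(normed)):
        if len(kept) >= target_size:
            break
        if not kept or (normed[kept] @ normed[i]).max() < sim_threshold:
            kept.append(i)
    # If the threshold was too strict to reach target_size, top up with
    # the earliest remaining samples so the class size is exact.
    for i in range(len(normed)):
        if len(kept) >= target_size:
            break
        if i not in kept:
            kept.append(i)
    return sorted(kept)

# Toy example: samples 0 and 1 are near-duplicates, sample 2 is distinct.
emb = np.array([[1.0, 0.0],
                [0.99, 0.14],
                [0.0, 1.0]])
print(undersample_majority(emb, target_size=2))  # → [0, 2]
```

In practice the embeddings would come from the pooled output of a pretrained BERT model; only the selection logic is sketched here.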
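The fusion step can be illustrated with a shape-level numpy stand-in: a TextCNN-style character channel (1-D convolution plus max-over-time pooling) and a word channel pooled by attention, with the LSTM replaced here by placeholder hidden states since only the fusion is being shown. All dimensions and names are illustrative, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d_maxpool(x, kernels):
    """Char channel: 1-D convolution over a (seq_len, dim) embedding
    sequence, then max-over-time pooling, one scalar per filter."""
    seq_len, _ = x.shape
    k = kernels.shape[1]  # kernels: (n_filters, k, dim)
    feats = []
    for w in kernels:
        acts = [np.sum(x[t:t + k] * w) for t in range(seq_len - k + 1)]
        feats.append(max(acts))
    return np.array(feats)

def attention_pool(h, query):
    """Word channel: softmax attention over hidden states
    (a stand-in for the attention-equipped LSTM's outputs)."""
    scores = h @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ h

# Toy inputs: 10 character embeddings and 6 word hidden states, dim 8.
chars = rng.normal(size=(10, 8))
words = rng.normal(size=(6, 8))
kernels = rng.normal(size=(4, 3, 8))  # 4 filters of width 3
query = rng.normal(size=8)

char_feat = conv1d_maxpool(chars, kernels)      # shape (4,)
word_feat = attention_pool(words, query)        # shape (8,)
fused = np.concatenate([char_feat, word_feat])  # shape (12,), fed to the classifier
```

A real implementation would use trainable layers (e.g. `nn.Conv1d` and `nn.LSTM` in PyTorch); the point here is that each channel reduces its sequence to a fixed-length vector before concatenation.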