Authors: LI Jinbiao (李金彪); HOU Jin (侯进)[1,2]; LI Chen (李晨)[2,3]; CHEN Zirui (陈子锐); HE Chuan (何川)[2,3]
Affiliations: [1] IPSOM Lab, School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, Sichuan, China; [2] National Engineering Laboratory of Comprehensive Transportation Big Data Application Technology, Southwest Jiaotong University, Chengdu 611756, Sichuan, China; [3] School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, Sichuan, China
Source: Microelectronics & Computer (《微电子学与计算机》), 2022, No. 6, pp. 41-50 (10 pages)
Fund: Sichuan Science and Technology Program (2020SYSY0016)
Abstract: To address the low classification accuracy, large parameter counts, and training difficulty of existing text classification algorithms on Chinese data, this paper optimizes the BERT algorithm. Because BERT cannot extract word-vector features when processing Chinese text, a uniform word-vector convolution module, AWC, is proposed. An attention mechanism is introduced into a conventional convolutional neural network to extract reliable word-vector features, from which local features of the text are then obtained, compensating for BERT's inability to extract word vectors. BERT's own self-attention network extracts global features that highlight the key meaning of the full text; the local features introduced by AWC are fused with these global features according to their importance, producing a richer text representation. The fused features are fed into a softmax layer to obtain the classification result. A balanced multi-head design, a hierarchical parameter-sharing mechanism, and fully connected layer optimization greatly reduce the number of model parameters while preserving accuracy, yielding BERT-AWC, a lightweight text classification algorithm based on a hybrid attention mechanism. Experiments on several public datasets show that, compared with the baseline BERT, the algorithm improves prediction accuracy by 1% to 5% while using only 3.6% of BERT's parameters, meeting the design goals.
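For illustration only, the following is a minimal PyTorch sketch of the pipeline the abstract describes: a convolution over word embeddings whose outputs are re-weighted by attention (the AWC idea), and an importance-weighted fusion of that local feature with a global BERT-style sentence feature before a softmax classifier. All class names, dimensions, and the sigmoid-gated fusion are assumptions made for this sketch; they do not reproduce the authors' actual AWC design, fusion weights, or parameter-sharing scheme.

```python
# Hypothetical sketch of an attention-weighted convolution (AWC-style)
# module plus local/global feature fusion, assuming a PyTorch setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AWCModule(nn.Module):
    """Attention-weighted convolution over word embeddings (assumed design)."""
    def __init__(self, embed_dim: int, num_filters: int, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size,
                              padding=kernel_size // 2)
        self.attn = nn.Linear(num_filters, 1)  # scores each sequence position

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, embed_dim) token embeddings
        h = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)  # (B, L, F)
        scores = F.softmax(self.attn(h), dim=1)                       # (B, L, 1)
        return (scores * h).sum(dim=1)         # (B, F) pooled local feature

class FusionClassifier(nn.Module):
    """Importance-weighted fusion of local (AWC) and global (BERT) features."""
    def __init__(self, local_dim: int, global_dim: int, num_classes: int):
        super().__init__()
        self.proj = nn.Linear(local_dim, global_dim)
        self.gate = nn.Linear(2 * global_dim, 1)  # learns the mixing weight
        self.out = nn.Linear(global_dim, num_classes)

    def forward(self, local_feat: torch.Tensor,
                global_feat: torch.Tensor) -> torch.Tensor:
        local_feat = self.proj(local_feat)
        g = torch.sigmoid(self.gate(torch.cat([local_feat, global_feat], dim=-1)))
        fused = g * local_feat + (1 - g) * global_feat
        return F.log_softmax(self.out(fused), dim=-1)  # softmax classification

# Usage with random tensors standing in for embeddings / a BERT [CLS] vector:
emb = torch.randn(2, 32, 128)   # (batch, seq_len, embed_dim)
cls = torch.randn(2, 256)       # global sentence feature
awc = AWCModule(embed_dim=128, num_filters=64)
head = FusionClassifier(local_dim=64, global_dim=256, num_classes=10)
log_probs = head(awc(emb), cls)  # (2, 10)
```

The gate learns, per example, how much the local versus global feature should contribute, which is one plausible reading of fusing features "according to their importance".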
Keywords: text classification; attention mechanism; convolutional neural network; hybrid attention mechanism
Classification code: TP391 (Automation and Computer Technology: Computer Application Technology)