Authors: LI Li (李莉) [1,2]; ZHANG Zhi-xin (张之欣); WANG Xiao-long (王小龙)
Affiliations: [1] School of Control and Computer Engineering, North China Electric Power University, Baoding 071003, China; [2] Hebei Key Laboratory of Knowledge Computing for Energy & Power, Baoding 071003, China
Source: Science Technology and Engineering (《科学技术与工程》), 2025, No. 2, pp. 649-656 (8 pages)
Abstract: Large pre-trained language models face three problems when classifying news headlines: very large parameter counts, inefficient use of contextual semantic features, and the tendency of recurrent convolutional neural networks (RCNN) to neglect the importance of the initial input elements. To address these problems, a news headline classification method is proposed that combines ERNIE (enhanced representation through knowledge integration) improved with a mixture-of-experts (MoE) model and an RCNN with an attention mechanism. First, the MoE-improved ERNIE encodes the text; the attention RCNN then classifies it while preserving the word order and features of the text. To improve classification ability, the RCNN is modified by computing fused context weights for the input. When the weight of each expert in the MoE is computed, Gumbel-Softmax is used as a new gating function in place of the traditional Softmax function, so that the degree of smoothness can be better controlled. The experimental results show that, compared with traditional classification methods, the proposed method has significant advantages and greatly reduces the number of parameters, while the F1 value is 0.51% higher than that of the traditional model. Ablation experiments confirm the feasibility of the method for the classification task.
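Keywords: mixture of experts; knowledge-enhanced semantic representation model (ERNIE); attention mechanism; recurrent convolutional neural network; text classification
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]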
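The abstract says that Gumbel-Softmax is used as the MoE gating function so that the smoothness of the expert weights can be controlled. The record does not give the paper's formulation, so the following is only a minimal sketch of the standard Gumbel-Softmax relaxation applied to gate logits. The function names `gumbel_softmax` and `mix_experts`, the temperature parameter `tau`, and the example logits are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=None):
    """Sample gate weights as softmax((logits + g) / tau) with g ~ Gumbel(0, 1).

    Hypothetical helper: `tau` is the temperature; smaller values give
    sharper (closer to one-hot) weights, larger values give smoother ones.
    """
    rng = rng or random.Random()
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so that both logarithms are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        noisy.append((logit + gumbel_noise) / tau)
    # Numerically stable softmax over the perturbed, scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def mix_experts(expert_outputs, gate_weights):
    """Combine expert output vectors as a weighted sum (assumed mixing rule)."""
    dim = len(expert_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(gate_weights, expert_outputs))
        for i in range(dim)
    ]


if __name__ == "__main__":
    gate_logits = [1.0, 2.0, 0.5]  # illustrative gate scores for three experts
    weights = gumbel_softmax(gate_logits, tau=0.5, rng=random.Random(0))
    outputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy expert outputs
    print(weights)
    print(mix_experts(outputs, weights))
```

In this sketch the temperature is the only knob for smoothness: for the same noise draw, lowering `tau` concentrates the weight on fewer experts, while raising it spreads the weight more evenly.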