Authors: ZHU Bo (朱波); LI Kui (黎魁); QIU Lan (邱兰); LI Bo (黎博)
Affiliations: [1] Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology (昆明理工大学), Kunming 650504, China; [2] Faculty of Mechanical and Electrical Engineering, Wuhan Engineering University (武汉工程大学), Wuhan 430205, China
Source: Transactions of the Chinese Society for Agricultural Machinery (《农业机械学报》), 2024, No. 5, pp. 273-281 (9 pages)
Funding: Key Project of the Yunnan Provincial Tobacco Company, China National Tobacco Corporation (2021530000241012).
Abstract: To address feature sparsity, abundant domain terminology, and the difficulty of capturing semantic associations within text in question intent recognition for the tobacco domain, a question intent recognition method is proposed that fuses SBERT-Attention-LDA (Sentence-Bidirectional Encoder Representations from Transformers, Attention mechanism, Latent Dirichlet Allocation) features with ML-LSTM (Multi-Layer Long Short-Term Memory). The method first dynamically encodes tobacco questions with the SBERT pre-trained model and an Attention mechanism, converting them into semantically rich feature vectors, while an LDA model builds a topic vector for each question to capture its topic information. A modified model-level feature fusion method, ML-LSTM, then produces a joint feature representation that carries more complete and accurate question semantics. Next, a 3-channel convolutional neural network (CNN) extracts hidden features from this hybrid semantic representation and feeds them into a fully connected layer and a Softmax function to classify the question intent. Experiments on a dataset collected from authoritative tobacco industry websites show that the proposed method significantly improves precision, recall, and F1 over several other methods that combine deep learning with attention mechanisms. Compared with the BERT- and ERNIE (Enhanced Representation through Knowledge Integration)-CNN models, F1 improves by 2.07 and 2.88 percentage points, respectively, which supports the construction of question answering systems for tobacco websites.
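The abstract reports precision, recall, and F1, and states gains in percentage points. As a reading aid, the sketch below shows how these quantities are commonly computed for a multi-class intent classifier. It is an illustration only: the abstract does not say whether the scores are macro- or micro-averaged, so macro averaging is an assumption here, and the function names (`macro_prf`, `percentage_point_gain`) and the intent labels are hypothetical, not taken from the paper.

```python
from collections import Counter


def macro_prf(y_true, y_pred):
    """Macro-averaged (precision, recall, F1) over all labels seen.

    Assumption: macro averaging; the paper's averaging scheme is not stated.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p = tp[label] / (tp[label] + fp[label]) if tp[label] + fp[label] else 0.0
        r = tp[label] / (tp[label] + fn[label]) if tp[label] + fn[label] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


def percentage_point_gain(new_score, baseline_score):
    """Absolute difference between two scores in [0, 1], in percentage points."""
    return (new_score - baseline_score) * 100


# Hypothetical intent labels, for illustration only.
y_true = ["price", "price", "policy", "policy", "brand", "brand"]
y_pred = ["price", "policy", "policy", "policy", "brand", "price"]
precision, recall, f1 = macro_prf(y_true, y_pred)
```

A gain stated in percentage points is an absolute difference between two scores, not a relative improvement: under this convention, moving from an F1 of 0.88 to 0.90 is a gain of 2 percentage points.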