Tobacco Interrogative Intent Recognition Based on SBERT-Attention-LDA and ML-LSTM Feature Fusion

Cited by: 1

Authors: ZHU Bo (朱波); LI Kui (黎魁); QIU Lan (邱兰); LI Bo (黎博)

Affiliations: [1] Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650504, China; [2] Faculty of Mechanical and Electrical Engineering, Wuhan Engineering University, Wuhan 430205, China

Source: Transactions of the Chinese Society for Agricultural Machinery (《农业机械学报》), 2024, Issue 5, pp. 273-281 (9 pages)

Funding: Key Project of Yunnan Provincial Tobacco Company, China National Tobacco Corporation (2021530000241012).

Abstract: Question intent recognition in the tobacco domain suffers from sparse features, abundant domain terminology, and difficulty in capturing semantic associations within the text. To address these problems, a question intent recognition method based on feature fusion of SBERT-Attention-LDA (Sentence-bidirectional encoder representations from transformers, Attention mechanism, Latent Dirichlet allocation) and ML-LSTM (Multi-layer long short-term memory) is proposed. The method first dynamically encodes tobacco questions using the SBERT pre-trained model together with an Attention mechanism, converting them into semantically rich feature vectors; in parallel, an LDA model produces a topic vector for each question to capture its topic information. A modified model-level feature fusion method, ML-LSTM, then yields a joint feature representation with more complete and accurate question semantics. Next, a 3-channel convolutional neural network (CNN) extracts hidden features from this mixed semantic representation and feeds them into a fully connected layer and a Softmax function to classify the question intent. Experiments were conducted on a dataset collected from authoritative tobacco-industry websites. The results show that the proposed method significantly improves precision, recall, and F1 over several other deep learning methods combined with attention mechanisms. Compared with the BERT and ERNIE (Enhanced representation through knowledge integration)-CNN models, the F1 value is higher by 2.07 and 2.88 percentage points, respectively, which supports the construction of a Q&A system for tobacco websites.
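The data flow described in the abstract (attention-based encoding, a topic vector, fusion, then Softmax classification) can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: dot-product attention pooling stands in for the SBERT-Attention encoder, plain concatenation stands in for ML-LSTM fusion, and a single linear layer stands in for the 3-channel CNN and fully connected layer. All function names, vectors, weights, and the two intent classes are hypothetical and chosen only to make the shapes visible.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(token_vectors, query):
    """Collapse token vectors into one vector using dot-product attention.

    Hypothetical stand-in for the SBERT + Attention encoder.
    """
    scores = [sum(t * q for t, q in zip(vec, query)) for vec in token_vectors]
    weights = softmax(scores)
    dim = len(token_vectors[0])
    return [
        sum(w * vec[d] for w, vec in zip(weights, token_vectors))
        for d in range(dim)
    ]


def fuse(semantic_vector, topic_vector):
    """Join the two views by concatenation (stand-in for ML-LSTM fusion)."""
    return list(semantic_vector) + list(topic_vector)


def classify(features, weight_rows, biases):
    """One linear layer plus softmax (stand-in for the CNN + FC classifier)."""
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weight_rows, biases)
    ]
    return softmax(logits)


# Toy inputs, all invented for illustration.
token_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 tokens, 2 dims each
query = [0.5, 0.5]                                     # attention query vector
topic_vector = [0.7, 0.2, 0.1]                         # LDA-style topic mixture

semantic_vector = attention_pool(token_vectors, query)
fused = fuse(semantic_vector, topic_vector)            # length 2 + 3 = 5

# Two hypothetical intent classes with hand-picked weights.
weight_rows = [[1.0, 0.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 1.0, 1.0]]
biases = [0.0, 0.0]
probabilities = classify(fused, weight_rows, biases)
predicted_intent = probabilities.index(max(probabilities))
```

The point of the sketch is only that the semantic vector and the topic vector are computed independently and combined before classification; a faithful reproduction would need the SBERT, LDA, LSTM, and CNN components the abstract names.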

Keywords: tobacco; question classification; natural language processing; feature fusion; self-attention mechanism

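Classification codes: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]; TP391 [Automation and Computer Technology: Control Science and Engineering]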
