Authors: XING Yinan (邢义男)[1]; ZHANG Nana (张娜娜)[2]
Affiliations: [1] College of Information Technology, Shanghai Ocean University, Shanghai 201306, China; [2] College of Information Technology, Shanghai Jian Qiao University, Shanghai 201306, China
Source: Computer Engineering and Applications (《计算机工程与应用》), 2022, No. 23, pp. 205-213 (9 pages)
Funding: Shanghai Municipal Education Commission "Chenguang Program" fund (AASH1702).
Abstract: Question intent classification is one of the key tasks in a question answering system, and its accuracy matters greatly for the downstream answering steps. Questions about civil disputes vary widely in length, have scattered features, and fall into many categories, and traditional convolutional neural networks and word vectors handle them poorly. To identify the intent category of such questions accurately, an intent classification model for civil dispute questions is constructed that combines BERT with a multi-scale CNN. First, the civil dispute question data set is preprocessed. Second, the BERT pre-trained model encodes each question and supplements its semantic information. Third, four convolution channels, each using convolution kernels of a different scale, are applied, and the four resulting question features are concatenated to obtain multi-level question feature information. Finally, a fully connected layer and a Softmax layer classify the question. Experimental results show that the proposed model achieves an accuracy of 87.41% on a Chinese civil dispute question data set, with recall and F1 score reaching 87.52% and 87.39%, respectively, indicating that it can effectively address intent classification for civil dispute questions.
Keywords: intent classification of civil dispute questions; BERT; multi-scale CNN; natural language question understanding
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
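
The pipeline described in the abstract (encoded question, four convolution channels with different kernel scales, concatenation, fully connected layer, Softmax) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the kernel widths, filter count, class count, embedding size, ReLU activation, and max-over-time pooling are all assumptions, since the abstract does not state them, and random vectors stand in for the BERT token encodings.

```python
import math
import random

# Assumed values: the abstract specifies only "four channels", not these numbers.
KERNEL_SIZES = (2, 3, 4, 5)  # hypothetical kernel width for each channel
EMBED_DIM = 8                # stand-in for the BERT hidden size
NUM_FILTERS = 4              # hypothetical number of filters per channel
NUM_CLASSES = 3              # hypothetical number of intent categories


def conv1d_max(tokens, filters, bias):
    """Slide each filter over the tokens, apply ReLU, then max-pool over time."""
    width = len(filters[0])
    pooled = []
    for f, b in zip(filters, bias):
        best = 0.0  # a ReLU output is never below zero
        for start in range(len(tokens) - width + 1):
            window = tokens[start:start + width]
            act = b + sum(w * x
                          for w_row, x_row in zip(f, window)
                          for w, x in zip(w_row, x_row))
            best = max(best, act)
        pooled.append(best)
    return pooled


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(rng):
    """Randomly initialise one filter bank per channel plus the final linear layer."""
    def rand():
        return rng.uniform(-0.1, 0.1)

    channels = []
    for k in KERNEL_SIZES:
        filters = [[[rand() for _ in range(EMBED_DIM)] for _ in range(k)]
                   for _ in range(NUM_FILTERS)]
        channels.append((filters, [0.0] * NUM_FILTERS))
    feat_dim = NUM_FILTERS * len(KERNEL_SIZES)
    fc_w = [[rand() for _ in range(feat_dim)] for _ in range(NUM_CLASSES)]
    fc_b = [0.0] * NUM_CLASSES
    return channels, fc_w, fc_b


def classify(tokens, params):
    """Concatenate the pooled features of all channels, then apply linear + softmax."""
    channels, fc_w, fc_b = params
    features = []
    for filters, bias in channels:
        features.extend(conv1d_max(tokens, filters, bias))
    logits = [b + sum(w * x for w, x in zip(row, features))
              for row, b in zip(fc_w, fc_b)]
    return features, softmax(logits)


rng = random.Random(0)
params = init_params(rng)
# Ten random token vectors stand in for the BERT encoding of one question.
question = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(10)]
features, probs = classify(question, params)
print(len(features), probs)
```

Each channel contributes NUM_FILTERS pooled values, so the concatenated feature vector has one entry per filter across all four scales regardless of question length, which is one plausible reading of how a multi-scale design copes with questions of varying length.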