Authors: 代翔 (DAI Xiang)[1], 孙海春 (SUN Haichun), 牛硕 (NIU Shuo), 朱容辰 (ZHU Rongchen)
Affiliation: [1] School of Information and Network Security, People's Public Security University of China, Beijing 100038, China
Source: 《信息网络安全》 (Netinfo Security), 2021, No. 12, pp. 102-108 (7 pages)
Funding: National Natural Science Foundation of China [41971367]; National Key Research and Development Program of China [2017YFC0803700]; Ministry of Public Security Technology Research Program [2020JSYJC22ok].
Abstract: Question-answer matching is one of the key technologies of question answering systems. Traditional question-answer matching models represent Chinese word vectors imprecisely and extract too few interaction features between texts. To address these two problems, this paper proposes an attention-based bidirectional encoder representation model for question-answer matching. For Chinese vector representation, transfer learning is used to introduce the parameters of a pretrained Chinese BERT model, which are then fine-tuned on the training set to obtain the optimal parameters. Chinese character vectors are represented by the BERT model, which addresses the insufficient representation ability of traditional word-vector models on Chinese vocabulary. At the text interaction layer, a mutual attention mechanism first extracts the interaction features between the question and the answer, and these features are combined with the input vectors of the attention mechanism to form a feature combination. A bidirectional long short-term memory network (BiLSTM) then performs inference composition, reduces the feature dimension, and integrates contextual semantic information. Finally, the model is tested on a Chinese legal dataset. The results show that the model outperforms several traditional models. Compared with ESIM, it improves Top-1 accuracy by 3.55%, MAP by 5.21%, and MRR by 4.05%.
Classification number: TP309 [Automation and Computer Technology: Computer System Architecture]
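The abstract reports results in terms of Top-1 accuracy, MAP, and MRR. As a reading aid, the sketch below shows how these three ranking metrics are commonly computed. It is not taken from the paper: the function names are hypothetical, and it assumes that each question comes with a list of binary relevance labels (1 = correct answer, 0 = incorrect) already sorted by the model's matching score in descending order. The paper's actual evaluation protocol may differ.

```python
from typing import Sequence


def reciprocal_rank(labels: Sequence[int]) -> float:
    """1 / rank of the first relevant candidate, or 0.0 if there is none."""
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            return 1.0 / rank
    return 0.0


def average_precision(labels: Sequence[int]) -> float:
    """Mean of precision@k taken at every rank k that holds a relevant candidate."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def top1_accuracy(rankings: Sequence[Sequence[int]]) -> float:
    """Fraction of questions whose highest-scored candidate is relevant."""
    if not rankings:
        raise ValueError("rankings must not be empty")
    correct = sum(1 for labels in rankings if labels and labels[0] == 1)
    return correct / len(rankings)


def mean_reciprocal_rank(rankings: Sequence[Sequence[int]]) -> float:
    """MRR: reciprocal rank averaged over all questions."""
    if not rankings:
        raise ValueError("rankings must not be empty")
    return sum(reciprocal_rank(labels) for labels in rankings) / len(rankings)


def mean_average_precision(rankings: Sequence[Sequence[int]]) -> float:
    """MAP: average precision averaged over all questions."""
    if not rankings:
        raise ValueError("rankings must not be empty")
    return sum(average_precision(labels) for labels in rankings) / len(rankings)


if __name__ == "__main__":
    # Illustrative labels only, one list per question, sorted by model score.
    example = [[1, 0, 0], [0, 1, 0], [0, 0, 1, 1]]
    print("Top-1:", top1_accuracy(example))
    print("MRR:  ", mean_reciprocal_rank(example))
    print("MAP:  ", mean_average_precision(example))
```

Top-1 accuracy only rewards a correct answer in first place, MRR gives partial credit for the position of the first correct answer, and MAP additionally accounts for every correct answer in the list, which is why the three figures can move by different amounts for the same model.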