Authors: WANG Qin-chen (王钦晨); DUAN Li-guo (段利国) [1,2]; WANG Jun-shan (王君山); ZHANG Hao-yan (张昊妍); GAO Hao (郜浩)
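Affiliations: [1] College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, Shanxi 030600, China; [2] School of Information Technology Innovation, Shanxi University of Electronic Science and Technology, Linfen, Shanxi 041000, China; [3] Network Security Corps of Beijing Municipal Public Security Bureau, Beijing 100740, China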
Source: Computer Engineering & Science (《计算机工程与科学》), 2024, No. 7, pp. 1321-1330 (10 pages)
Funding: Natural Science Foundation of Shanxi Province (202203021221234).
Abstract: Short text semantic matching is a core problem in natural language processing, with wide application in automatic question answering, search engines, and related areas. Most previous work considers only the similar parts of two texts and ignores the parts that differ, so the model cannot make full use of the key information that determines whether the texts match. To address this, the paper proposes a short text semantic matching strategy based on BERT character and sentence vectors and differential attention. BERT is used to produce vector representations of the sentence pair. A BiLSTM combined with a multi-head differential attention mechanism then obtains attention weights that capture the difference in intent between the current character vector and the global semantic information of the text. A one-dimensional convolutional neural network reduces the dimensionality of the semantic feature vectors of the sentence pair. Finally, the character and sentence vectors are concatenated and fed into a fully connected layer to compute the degree of semantic matching between the two sentences. Experiments on the LCQMC and BQ Corpus datasets show that the strategy effectively extracts semantic difference information from text and thereby improves model performance.
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
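The abstract describes attention weights that reflect how much each character vector differs from the global meaning of the text. The record does not give the paper's formula, so the following is only a rough, single-head sketch under stated assumptions: the global vector is taken as the mean of the token vectors, the score is the Euclidean distance to that mean, and the names `mean_vector` and `difference_attention` are hypothetical. It omits BERT, the BiLSTM, the multiple heads, and the convolution entirely.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors (assumed stand-in for the global sentence vector)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def difference_attention(token_vectors):
    """Weight each token by its distance from the global vector.

    Assumption: distance-to-mean is used as the difference score; the
    paper's actual scoring function is not stated in this record.
    """
    global_vec = mean_vector(token_vectors)
    scores = [
        math.sqrt(sum((t - g) ** 2 for t, g in zip(token, global_vec)))
        for token in token_vectors
    ]
    weights = softmax(scores)
    # Attention-weighted sum of the token vectors.
    pooled = [
        sum(w * token[i] for w, token in zip(weights, token_vectors))
        for i in range(len(global_vec))
    ]
    return weights, pooled


# Toy input: two identical tokens and one that deviates from them.
tokens = [[1.0, 0.0], [1.0, 0.0], [0.0, 3.0]]
weights, pooled = difference_attention(tokens)
```

In this toy input the third token lies farthest from the mean, so it receives the largest weight, which is the intuition behind emphasizing the parts of a text that differ rather than the parts that agree.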