Extractive Machine Reading Comprehension Model with Explicitly Fused Lexical and Syntactic Features

Authors: YAN Wei-Hong, LI Shao-Bo, SHAN Li-Li[2], SUN Cheng-Jie[2], LIU Bing-Quan[2]

Affiliations: [1] State Key Laboratory of Communication Content Cognition, People's Daily Online, Beijing 100733, China; [2] Faculty of Computing, Harbin Institute of Technology, Harbin 150006, China

Source: Computer Systems & Applications (《计算机系统应用》), 2022, No. 9, pp. 352-359 (8 pages)

Funding: National Natural Science Foundation of China (62176074).

Abstract: Pre-trained language models can provide excellent contextual representation features for each word, but they cannot explicitly provide lexical and syntactic features, which are often the basis for understanding the overall semantics. In this study, we investigate the impact of lexical and syntactic features on the reading comprehension ability of pre-trained models by introducing these features explicitly. First, we use part-of-speech tagging and named entity recognition to provide lexical features, and dependency parsing to provide syntactic features; these features are fused with the contextual representations output by the pre-trained model. Then, we design an adaptive feature fusion method based on the attention mechanism to combine the different types of features. Experiments on the extractive machine reading comprehension dataset CMRC2018 show that, at a very low computational cost, the explicitly introduced lexical and syntactic features help the model achieve improvements of 0.37% in F1 and 1.56% in EM, respectively.
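As an illustration of the attention-based adaptive feature fusion described in the abstract, the following is a minimal sketch, not the authors' released code. It assumes discrete POS, NER, and dependency-relation tag IDs produced by an external tagger/parser, and the tag-set sizes, embedding dimension, and scalar per-feature attention scores are all assumptions made for this sketch.

```python
import torch
import torch.nn as nn

class AdaptiveFeatureFusion(nn.Module):
    """Sketch: fuse contextual, lexical (POS/NER), and syntactic (dependency-relation)
    features per token, weighting each feature type with learned attention scores."""
    def __init__(self, hidden_size=768, num_pos_tags=32, num_ner_tags=16,
                 num_dep_rels=48, feat_dim=64):
        super().__init__()
        # Embed discrete linguistic tags into dense vectors (sizes are hypothetical).
        self.pos_emb = nn.Embedding(num_pos_tags, feat_dim)
        self.ner_emb = nn.Embedding(num_ner_tags, feat_dim)
        self.dep_emb = nn.Embedding(num_dep_rels, feat_dim)
        # Project each feature type to the encoder's hidden size.
        self.proj = nn.ModuleList([nn.Linear(d, hidden_size)
                                   for d in (hidden_size, feat_dim, feat_dim, feat_dim)])
        # One scalar attention score per feature type and token.
        self.score = nn.Linear(hidden_size, 1)

    def forward(self, context_repr, pos_ids, ner_ids, dep_ids):
        # context_repr: (batch, seq_len, hidden_size) from the pre-trained encoder;
        # *_ids: (batch, seq_len) tag IDs aligned to the tokens.
        feats = [context_repr,
                 self.pos_emb(pos_ids),
                 self.ner_emb(ner_ids),
                 self.dep_emb(dep_ids)]
        projected = [proj(f) for proj, f in zip(self.proj, feats)]   # each (B, L, H)
        stacked = torch.stack(projected, dim=2)                      # (B, L, 4, H)
        weights = torch.softmax(self.score(stacked), dim=2)          # (B, L, 4, 1)
        fused = (weights * stacked).sum(dim=2)                       # (B, L, H)
        return fused
```

In such a design, the fused representation would replace the plain encoder output fed to the span-prediction head of the extractive reader; since only small embedding tables and linear layers are added, the extra computational cost stays low, consistent with the claim in the abstract.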

Keywords: machine reading comprehension; lexical features; syntactic features; deep learning; pre-trained models; feature fusion; attention mechanism

CLC Number: TP391.41 (Automation and Computer Technology: Computer Application Technology)

 
