Simulation of Integrated Extraction Method for Multiple Data Based on Knowledge Graph  Cited by: 2


Authors: HE Fang-zhou [1,2,3]; WANG Zhi-qi (Criminal Investigation Police University of China, Shenyang, Liaoning 110854, China; Shenyang Institute of Computing Technology, Chinese Academy of Sciences, 110168, China; University of Chinese Academy of Sciences, Beijing 100049, China)

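Affiliations: [1] Criminal Investigation Police University of China, Shenyang, Liaoning 100854; [2] Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang, Liaoning 110168; [3] University of Chinese Academy of Sciences, Beijing 100049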

Source: Computer Simulation (《计算机仿真》), 2023, No. 12, pp. 422-427 (6 pages)

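Funding: Public Security Theory and Soft Science Research Program (RKX20201033); Liaoning Provincial Social Science Planning Fund (L21ASH004); Key Laboratory of Security Prevention Technology and Risk Assessment, Ministry of Public Security (2021KFKTJC01); Intelligent Policing Key Laboratory of Sichuan Province (ZNJW2022KFMS002); Open Project of the Key Laboratory of Crime Scene Evidence, Shanghai Research Institute of Criminal Science and Technology (2019XCWZK06, 2020XCWZK02, 2020XCWZK03); Liaoning Economic and Social Development Research Project (2022lslybkt-039); Public Security Discipline Basic Theory Research Innovation Program (2022XKGJ0107).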

Abstract: Traditional triple-based entity relation extraction over multi-source data knowledge suffers from low entity recognition accuracy, high extraction complexity, a high overlap rate, and a large amount of missing data. This paper constructs BIO-VEC-DAM, a neural network model that uses a two-layer attention mechanism to enhance two kinds of features, improving the entity recognition rate and reducing the complexity and overlap rate of entity relation extraction. The model first integrates the multiple data with the BIO annotation method, which reduces redundancy and builds the initial form of the triples. It then extracts the position features of the triples with an improved BIO annotation method and their confidence features with the Word2Vec algorithm. Next, the two-layer attention mechanism enhances the position and confidence features to make them more discriminative. Finally, the enhanced features are fed into the neural network model for training. Simulation results from the entity relation extraction experiments show that BIO-VEC-DAM significantly improves accuracy and recall over the traditional model and over a CNN model with single-feature enhancement. In the entity recognition task, after training on different datasets, its F1 score is on average 0.96% higher than that of the other models. The BIO-VEC-DAM model therefore performs better in the triple entity relation extraction task.
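The abstract names BIO annotation as the first step of the pipeline but does not describe the authors' improved variant. As a minimal sketch of the standard BIO scheme only, the following Python converts entity spans into B-/I-/O tags and back. The function names `spans_to_bio` and `bio_to_spans`, the `(start, end_exclusive, label)` span format, and the sample sentence are illustrative assumptions, not taken from the paper.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label); hypothetical format


def spans_to_bio(tokens: List[str], spans: List[Span]) -> List[str]:
    """Tag each token with B-<label>, I-<label>, or O (standard BIO scheme)."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        if not (0 <= start < end <= len(tokens)):
            raise ValueError(f"span out of range: {(start, end, label)}")
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Recover entity spans from a BIO tag sequence."""
    spans: List[Span] = []
    start, label = None, None
    # A trailing "O" sentinel closes any entity still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        if start is not None and tag != f"I-{label}":
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


# Illustrative sentence and entity spans (not from the paper's datasets).
tokens = ["Alice", "Smith", "works", "at", "Acme"]
entity_spans = [(0, 2, "PER"), (4, 5, "ORG")]
tags = spans_to_bio(tokens, entity_spans)
recovered = bio_to_spans(tags)
```

In a triple extraction setting, the recovered spans would supply the head and tail entities of a candidate (entity, relation, entity) triple. How the paper derives position features from its improved tags is not stated in the abstract, so that step is not shown here.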

Keywords: data integration; entity relation extraction; entity recognition

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
