Research on Entity Alignment of Crowdfunding and Crowd-creation Educational Resources Based on BERT and TransE

(Original Chinese title: 基于BERT和TransE的众筹众创教育资源实体对齐研究)


Authors: LIU Shangdong [1,2], HU Lin, TAN Ping [3], JI Yimu [1,2], XU He [1,2]

Affiliations: [1] School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; [2] Engineering Research Center of High Performance Computing and Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; [3] Tongda College of Nanjing University of Posts and Telecommunications, Yangzhou 225127, China

Source: Journal of Beijing Institute of Graphic Communication (《北京印刷学院学报》), 2022, No. 6, pp. 58-67 (10 pages)

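Funding: Jiangsu Province Higher Education Teaching Reform Research Project "Construction and Application of Online Course Resources Based on the 微助教 Platform in the Context of 'Internet + Education'" (2019JSJG276); Teaching Reform Research Project of Tongda College of Nanjing University of Posts and Telecommunications "Informatization Construction and Application of University Course Teaching Resources Based on the 微助教 Platform in the Context of 'Internet + Education'" (JG20619013).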

Abstract: Advances in computer and network technology have made the crowdfunding and crowd-creation of digital educational resources (CCDER) possible. Quality is a key part of CCDER services and the basis for realizing the value of these resources. A high-quality knowledge graph is an important tool for generating digital educational resources, and entity alignment is a critical step in knowledge graph construction. When CCDER are generated, the namespaces of multi-source knowledge graphs are highly heterogeneous; the existing entity alignment methods used to eliminate this heterogeneity offer limited automation and do not make full use of the attribute information in the knowledge graphs. For CCDER knowledge graphs, which contain rich attribute information, how to use that information to improve the efficiency and accuracy of entity alignment is the core problem of CCDER generation. The paper proposes a method that represents entities with vectors derived from both relation triples and attribute triples, trained iteratively with a pre-trained language model and knowledge representation techniques. For relation triples, the semantic representation ability of BERT (Bidirectional Encoder Representation from Transformers) is transferred to the initialization stage of the TransE (Translating Embeddings) model, and the resulting initial vector space is then iteratively trained with TransE to improve efficiency and accuracy at the structural level. For attribute triples, vector representations are produced with BERT, and attribute-level entity vectors are obtained according to the translation model and a TF-IDF weight allocation strategy. The two are finally combined into a joint entity vector. Experiments show that the method achieves better entity alignment performance.
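The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the loss, the exact TF-IDF weighting, or how the two vectors are joined, so the margin ranking loss, the smoothed IDF formula, the concatenation step, and every function name below are assumptions made for illustration. Small hand-written vectors stand in for the BERT embeddings the paper uses.

```python
import math
from collections import Counter


def transe_score(h, r, t):
    """TransE distance ||h + r - t||_2; a smaller value means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def margin_loss(pos, neg, margin=1.0):
    """Margin ranking loss for one (true, corrupted) triple pair (assumed objective)."""
    return max(0.0, margin + transe_score(*pos) - transe_score(*neg))


def idf_weights(entity_attrs):
    """Smoothed IDF per attribute name across entities (assumed weighting formula)."""
    n = len(entity_attrs)
    df = Counter(a for attrs in entity_attrs.values() for a in set(attrs))
    return {a: math.log(n / c) + 1.0 for a, c in df.items()}


def attribute_vector(attrs, value_vecs, idf):
    """TF-IDF weighted average of the value vectors of one entity's attributes.

    value_vecs maps attribute name -> vector; in the paper these would come
    from BERT, here they are toy placeholders. attrs must be non-empty.
    """
    tf = Counter(attrs)
    dim = len(next(iter(value_vecs.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for name, count in tf.items():
        w = count * idf[name]
        weight_sum += w
        for i, x in enumerate(value_vecs[name]):
            total[i] += w * x
    return [x / weight_sum for x in total]


def joint_vector(struct_vec, attr_vec):
    """Join structure-level and attribute-level vectors by concatenation (assumption)."""
    return list(struct_vec) + list(attr_vec)


def cosine(a, b):
    """Cosine similarity, used here to compare candidate entity pairs."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


if __name__ == "__main__":
    # Hypothetical toy data: two entities, two attribute names.
    attrs = {"e1": ["name", "isbn"], "e2": ["name"]}
    idf = idf_weights(attrs)
    vecs = {"name": [1.0, 0.0], "isbn": [0.0, 1.0]}
    a1 = attribute_vector(attrs["e1"], vecs, idf)
    a2 = attribute_vector(attrs["e2"], vecs, idf)
    e1 = joint_vector([0.2, 0.1], a1)
    e2 = joint_vector([0.2, 0.1], a2)
    print("similarity:", cosine(e1, e2))
```

In this sketch the rarer attribute (`isbn`) receives the larger weight, so it contributes more to the attribute-level vector; entity pairs from two knowledge graphs would then be ranked by the similarity of their joint vectors.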

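Keywords: crowdfunding and crowd-creation; educational resources; entity alignment; TransH; pre-trained language model; knowledge representation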

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
