Research on Academy Event Extraction Integrating Rotary Position Encoding and Graph Recursive Retrieval


Authors: YU Xuehan (喻雪寒); HE Lin (何琳)[1]

Affiliation: [1] Department of Information Management, Nanjing Agricultural University, Nanjing, Jiangsu 210095

Source: Journal of Academic Libraries (《大学图书馆学报》), 2025, No. 2, pp. 50-65 (16 pages)

Abstract: Academies were unique educational institutions in ancient China. The Chinese Academy Dictionary (《中国书院辞典》), an important record of these institutions, covers more than 1,600 historically verifiable academies from the Tang Dynasty to the Qing Dynasty and is of great value in revealing the historical inheritance of regional Confucian culture. After organizing the text of the Chinese Academy Dictionary, we found that the corpus has two characteristics. First, entries are organized by academy, and some entries exceed the input length that conventional pre-trained models accept. Second, different event types can share the same trigger word, so one trigger word may represent multiple event types, whereas traditional event extraction treats trigger word recognition as a sequence labeling task and ignores the correlation between trigger words and event arguments. To address these problems and to sort out and extract academy data comprehensively and systematically, we reviewed the modes and methods of event extraction and developed a method that integrates rotary position encoding and graph recursive retrieval to extract academy event information. The RoFormerV2 model encodes absolute positions so that each vector carries relative position information; then, following the idea of global normalization, event types and arguments are found recursively using the nested entity recognition model GlobalPointer and a complete subgraph search. Experiments on the Chinese Academy Dictionary showed that this method effectively integrated the position and semantic information of vectors, modeled the relevance between arguments, overcame the information loss caused by long texts and the nesting of event arguments, and had good extrapolation ability. Additionally, based on the existing event extraction results, this paper analyzed the
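The abstract's statement that encoding absolute positions leaves each vector carrying relative position information can be illustrated with a small sketch of rotary position encoding. This is not the authors' code or the RoFormerV2 implementation: the helper names `rope` and `dot`, the frequency base of 10000, the pairing of adjacent dimensions, and the toy vectors are all assumptions chosen for illustration.

```python
import math


def rope(vec, pos, base=10000.0):
    """Rotate each adjacent pair (vec[i], vec[i+1]) by an angle proportional to pos.

    Hypothetical helper: pair j = i // 2 uses frequency base ** (-2j / d),
    the usual rotary formulation, assumed here rather than taken from the paper.
    """
    d = len(vec)
    if d % 2:
        raise ValueError("vector dimension must be even")
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# Toy query/key vectors (illustrative values only).
q = [0.5, -1.0, 2.0, 0.3]
k = [1.5, 0.2, -0.7, 1.1]

# Same offset (3) at two different absolute positions.
same_offset_a = dot(rope(q, 2), rope(k, 5))
same_offset_b = dot(rope(q, 40), rope(k, 43))

# Different offset (7) starting from the same query position.
other_offset = dot(rope(q, 2), rope(k, 9))

print(same_offset_a, same_offset_b, other_offset)
```

Each vector is rotated using only its own absolute position, yet the inner product of a rotated query and a rotated key depends only on the difference between their positions. The first two printed scores therefore agree up to floating-point error, while the third differs.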

Keywords: Chinese Academy Dictionary; event extraction; RoFormerV2; GlobalPointer; graph recursive retrieval

CLC number: K061 [History and Geography: History]; TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
