Chinese Financial Event Extraction Based on Syntactic and Semantic Dependency Parsing

Original title: 基于句法语义依存分析的中文金融事件抽取 (cited by: 29)


Authors: WAN Qi-Zhi (万齐智) [1,3]; WAN Chang-Xuan (万常选); HU Rong (胡蓉) [2,3]; LIU De-Xi (刘德喜)

Affiliations: [1] School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330032; [2] School of Software and Internet of Things Engineering, Jiangxi University of Finance and Economics, Nanchang 330032; [3] Jiangxi Key Laboratory of Data and Knowledge Engineering, Jiangxi University of Finance and Economics, Nanchang 330013

Source: Chinese Journal of Computers (《计算机学报》), 2021, No. 3, pp. 508-530 (23 pages)

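Funding: Supported by the National Natural Science Foundation of China (61972184, 61562032, 61762042) and the Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ180198, GJJ180252).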

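Abstract (translated from the Chinese): Event extraction plays an important role in natural language processing applications such as stock market trend prediction. Traditional event extraction focuses mainly on whether triggers and arguments are assigned the correct types, and rarely analyzes the effect and practical value of event extraction in light of application requirements. In the financial domain, the objects an event acts on and the actions involved are the main concern. This paper therefore focuses on financial events and extracts event triples ET(Sub, Pred, Obj). Chinese financial news contains many nested events and shared components, which easily leads to missed events and missing event components. To address these problems, this paper builds a Chinese event extraction framework that combines syntactic and semantic dependency parsing, summarizes four common default structures, and designs corresponding completion rules. First, based on the syntactic dependency tree, the lexical and syntactic structure of verbs is analyzed to build a core verb chain in which each core verb corresponds to one event, which addresses missed events. Second, semantic dependency relations are added to the syntactic dependency tree to establish semantic associations between events, yielding a Syntactic Semantic Dependency Parsing (SSDP) tree. Third, the SSDP tree is adjusted to optimize its syntactic structure, forming an SSDP graph in which word nodes with equivalent syntactic structure sit at the same level, which provides a path for subsequent event extraction. Fourth, four common default structures are summarized and corresponding completion rules are designed to address missing event components. Finally, detailed experiments on Chinese financial news headlines and the CoNLL2009 Chinese corpus show that the method is effective.

Abstract (original English): As a sub-task of information extraction, event extraction plays an important role in natural language processing applications such as stock market trend forecasting, where it can provide strong clues for event users (e.g., investors, managers, and governments) to analyze the market and make decisions. At present, most studies of event extraction pay more attention to the type correctness of triggers and arguments and do not consider the effect and value of event extraction with respect to application requirements. We call this type of event extraction traditional event extraction. The event types and standards in traditional event extraction are derived from ACE2005 (8 categories and 33 sub-categories), KBP2015, ERE, and others. However, applying them to event extraction in the specific financial domain has limitations. For example, ACE2005 has no overweight event type, although overweighting is a characteristic behavior in the financial domain. In this paper, we focus on financial news and extract open events without types. In the field of finance and economics, most event users are more concerned with the objects and actions that events affect. Therefore, in line with this application requirement, we propose to extract the financial event ET(Sub, Pred, Obj), where Sub, Pred, and Obj represent the subject, predicate, and object, respectively. However, Chinese financial news generally suffers from event nesting and component default problems, which result in omitted events and missing key elements of events. To tackle this issue, drawing on the expression habits and characteristics of Chinese linguistics, we build a Chinese event extraction framework based on syntactic and semantic dependency parsing. We then summarize four common default structures and design corresponding completion rules. In particular, at the beginning of this paper, we summarize four prominent phenomena in the extraction of events from financial news headlines and explore the causes of these problems: no in-depth analysis of the relevance of syntactic and semantic structure, or lack of i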
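To make the triple form ET(Sub, Pred, Obj) concrete, here is a minimal Python sketch of the general idea in the abstract: one event per verb, plus a single completion rule that lets a coordinated verb borrow the subject it shares with an earlier verb. It is not the paper's SSDP algorithm, whose rules the abstract does not spell out. The `Token` class, the helper names, the toy sentence, and its hand-written parse are all hypothetical, and the relation labels (HED, SBV, VOB, COO) are assumed to follow LTP-style conventions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Token:
    idx: int   # 1-based position in the sentence
    word: str
    pos: str   # coarse part of speech: "n" (noun) or "v" (verb)
    head: int  # index of the syntactic head, 0 for the root
    rel: str   # dependency label (assumed LTP-style): HED, SBV, VOB, COO


def child(tokens: List[Token], head_idx: int, rel: str) -> Optional[Token]:
    """Return the first dependent of head_idx with the given label, if any."""
    for tok in tokens:
        if tok.head == head_idx and tok.rel == rel:
            return tok
    return None


def find_subject(tokens: List[Token], by_idx: Dict[int, Token],
                 verb: Token) -> Optional[Token]:
    """Illustrative completion rule: a verb with no subject of its own
    inherits one by walking up its chain of COO (coordination) heads."""
    current: Optional[Token] = verb
    while current is not None:
        sub = child(tokens, current.idx, "SBV")
        if sub is not None:
            return sub
        if current.rel != "COO":
            return None
        current = by_idx.get(current.head)
    return None


def extract_events(
    tokens: List[Token],
) -> List[Tuple[Optional[str], str, Optional[str]]]:
    """Emit one (Sub, Pred, Obj) triple per verb, in sentence order."""
    by_idx = {tok.idx: tok for tok in tokens}
    events = []
    for verb in (tok for tok in tokens if tok.pos == "v"):
        sub = find_subject(tokens, by_idx, verb)
        obj = child(tokens, verb.idx, "VOB")
        events.append((sub.word if sub else None,
                       verb.word,
                       obj.word if obj else None))
    return events


# Hand-written toy parse (not real parser output), conjunction omitted:
# "company / acquires / equity / increases holding of / shares"
TOY_PARSE = [
    Token(1, "公司", "n", 2, "SBV"),
    Token(2, "收购", "v", 0, "HED"),
    Token(3, "股权", "n", 2, "VOB"),
    Token(4, "增持", "v", 2, "COO"),
    Token(5, "股份", "n", 4, "VOB"),
]

if __name__ == "__main__":
    for event in extract_events(TOY_PARSE):
        print(event)
```

In this toy parse the second verb has no subject of its own, so without the completion step its triple would have an empty Sub slot. That is the kind of missing event component the abstract says the completion rules are meant to repair.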

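Keywords: Chinese event extraction; core verb chain; syntactic and semantic dependency parsing graph; event semantic association; default completion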

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
