Information extraction method of financial events based on lexical-semantic pattern  Cited by: 17


Authors: Luo Ming (罗明)[1], Huang Hailiang (黄海量)[1,2]

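Affiliations: [1] School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai 200433; [2] Shanghai Key Laboratory of Financial Information Technology, Shanghai University of Finance and Economics, Shanghai 200433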

Source: Journal of Computer Applications (《计算机应用》), 2018, No. 1, pp. 84-90 (7 pages)

Funding: Shanghai Science and Technology Talent Program (14XD1421000); Shanghai Science and Technology Innovation Action Plan (16511102900)

Abstract: Information extraction is one of the important tasks in natural language processing. To address the difficulty of information extraction caused by the diversity, ambiguity, and structure of natural language, a hierarchical Lexical-Semantic Pattern (LSP) method for extracting financial event information is proposed. First, a financial event representation model is defined. Second, a deep-learning-based word vector method is used to generate a synonym concept dictionary automatically. Finally, hierarchical lexical-semantic rule patterns driven by a finite state machine are used to extract the various kinds of financial event information automatically. Experimental results show that the proposed method accurately extracts financial event information from financial news text: over 26 types of financial events, the micro-averaged precision is 93.9%, the micro-averaged recall is 86.9%, and the micro-averaged F1 value is 90.3%.
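The micro-averaged figures in the abstract can be read as follows: true positives, false positives, and false negatives are pooled over all event types before precision, recall, and F1 are computed. The Python sketch below illustrates that standard definition only; it is not the authors' evaluation code. The function names `micro_prf` and `f1_from`, the event-type labels, and the per-type counts are hypothetical, since the record does not give per-type data. Only the values 93.9% and 86.9% come from the abstract.

```python
from typing import Dict, Tuple


def micro_prf(counts: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1.

    counts maps an event type to (true positives, false positives, false negatives).
    The counts are pooled across all types before the ratios are taken.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = f1_from(precision, recall)
    return precision, recall, f1


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical counts for two made-up event types (NOT data from the paper).
toy_counts = {
    "merger": (90, 5, 10),
    "dividend": (40, 5, 10),
}
toy_p, toy_r, toy_f1 = micro_prf(toy_counts)

# Consistency check on the figures reported in the abstract:
# precision 93.9% and recall 86.9% give an F1 of about 90.3%.
reported_f1 = f1_from(0.939, 0.869)
print(round(reported_f1, 3))
```

The final check shows that the reported F1 value agrees with the harmonic mean of the reported precision and recall.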

Keywords: lexical-semantic pattern; information extraction; financial event; word vector; word list; concept dictionary

Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
