Research on Event Extraction for Administrative Law Enforcement Case Texts (面向行政执法案件文本的事件抽取研究)


Authors: QU Xiaoya, LI Bing[1], WEN Liqiang

Affiliations: [1] School of Information Technology and Management, University of International Business and Economics, Beijing 100029, China; [2] School of Software and Microelectronics, Peking University, Beijing 100871, China

Source: Computer Engineering (《计算机工程》), 2024, No. 9, pp. 63-71 (9 pages)

Funding: National Key Research and Development Program of China, Ministry of Science and Technology (2020YFC0833304).

Abstract: The level of intelligence in administrative law enforcement reflects the modernization of national governance capacity, and data is an important foundation for the development of that intelligence. In the field of administrative law enforcement, administrative organs store large numbers of historical cases recorded in textual form. These unstructured data have low value density and limited usability. Using event extraction technology to quickly and efficiently extract structured information, such as the type of case authority and the time and place of case occurrence, from administrative law enforcement case texts can promote administrative organs' use of historical case information and support research on intelligent law enforcement and case handling. This study collects and organizes real case data from a city and constructs a dataset for the administrative law enforcement domain through manual annotation. Based on the characteristics of these case texts, namely the absence of trigger words, document-level scope, and non-fixed format, the study proposes a two-stage event extraction method that combines a Bidirectional Encoder Representations from Transformers (BERT) model with a Bi-directional Long Short-Term Memory network with Conditional Random Field (BiLSTM-CRF) model. The method performs event type detection and event argument extraction in sequence, through multi-class text classification and sequence labeling, respectively. Experimental results show that the F1 values of the event type detection and event argument extraction tasks reach 99.54% and 97.36%, respectively, achieving effective extraction of case information.
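The two-stage flow described in the abstract (classify the whole document, then label its characters and collect the arguments) can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the BIO tag scheme, the role names `TIME` and `PLACE`, the event type `road_occupation`, the idea that the tagger receives the predicted event type, and the `toy_classify` and `toy_tag` stand-ins are all hypothetical. In a real system the two callables would wrap a trained BERT classifier and a trained BiLSTM-CRF tagger.

```python
from typing import Callable, Dict, List, Tuple

# Stage interfaces (assumed for illustration; not taken from the paper).
Classifier = Callable[[str], str]            # document text -> event type
Tagger = Callable[[str, str], List[str]]     # (text, event type) -> one tag per character


def decode_bio(chars: str, tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (role, text) spans from character-level BIO tags."""
    if len(chars) != len(tags):
        raise ValueError("expected exactly one tag per character")
    spans: List[Tuple[str, str]] = []
    role, start = None, 0
    for i, tag in enumerate(tags + ["O"]):   # trailing sentinel flushes the last span
        continues = tag.startswith("I-") and role == tag[2:]
        if role is not None and not continues:
            spans.append((role, chars[start:i]))
            role = None
        if tag.startswith("B-"):
            role, start = tag[2:], i
    return spans


def extract_event(text: str, classify: Classifier, tag: Tagger) -> Dict[str, object]:
    """Run the two stages in order and return a structured record."""
    event_type = classify(text)              # stage 1: multi-class text classification
    tags = tag(text, event_type)             # stage 2: sequence labeling
    arguments: Dict[str, List[str]] = {}
    for role, value in decode_bio(text, tags):
        arguments.setdefault(role, []).append(value)
    return {"event_type": event_type, "arguments": arguments}


def toy_classify(text: str) -> str:
    """Hypothetical stand-in for a trained classifier."""
    return "road_occupation"


def toy_tag(text: str, event_type: str) -> List[str]:
    """Hypothetical stand-in for a trained tagger: marks two fixed phrases."""
    tags = ["O"] * len(text)
    for role, phrase in (("TIME", "2023-05-01"), ("PLACE", "Main Street")):
        start = text.find(phrase)
        if start >= 0:
            tags[start] = "B-" + role
            for i in range(start + 1, start + len(phrase)):
                tags[i] = "I-" + role
    return tags
```

Calling `extract_event("On 2023-05-01 a stall blocked Main Street.", toy_classify, toy_tag)` yields a record with the event type and one `TIME` and one `PLACE` argument. Passing the stages in as callables keeps the decoding logic independent of any particular model library; the example sentence and its labels are made up for the demonstration.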

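Keywords: administrative law enforcement cases; event extraction; two-stage method; Bidirectional Encoder Representations from Transformers (BERT) model; Bi-directional Long Short-Term Memory network with Conditional Random Field (BiLSTM-CRF) model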

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
