Authors: QU Xiaoya (屈潇雅); LI Bing (李兵)[1]; WEN Liqiang (温立强)
Affiliations: [1] School of Information Technology and Management, University of International Business and Economics, Beijing 100029, China; [2] School of Software and Microelectronics, Peking University, Beijing 100871, China
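Source: Computer Engineering (《计算机工程》), 2024, No. 9, pp. 63-71 (9 pages)
Funding: National Key Research and Development Program of China, Ministry of Science and Technology (2020YFC0833304).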
Abstract: The level of intelligence in administrative law enforcement is a manifestation of the modernization of national governance capacity, and data is an important support for the development of intelligence. In the field of administrative law enforcement, administrative organs store large numbers of historical cases recorded in textual form. This unstructured data has low value density and limited usability. Using event extraction technology to quickly and efficiently extract structured information from administrative law enforcement case texts, such as the type of case authority and the time and place of case occurrence, can promote the use of historical case records by administrative organs and support research on intelligent law enforcement. This study collects and organizes real case data from a city and constructs a dataset for the administrative law enforcement domain through manual annotation. Considering the characteristics of these case texts (no trigger words, document-level text, and unfixed format), the study proposes a two-stage event extraction method that combines a Bidirectional Encoder Representations from Transformers (BERT) model with a Bi-directional Long Short-Term Memory network with Conditional Random Field (BiLSTM-CRF) model. The method performs event type detection and event argument extraction in sequence, through text multi-classification and sequence labeling respectively. Experimental results show that the F1 values of the event type detection and event argument extraction tasks reach 99.54% and 97.36%, respectively, achieving effective extraction of case information.
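Keywords: administrative law enforcement cases; event extraction; two-stage method; Bidirectional Encoder Representations from Transformers (BERT) model; Bi-directional Long Short-Term Memory network with Conditional Random Field (BiLSTM-CRF) model
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]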