Authors: Ma Jianwei (马健伟), Wang Tiexin (王铁鑫), Jiang Hong (江宏), Chen Tao (陈涛), Zhang Chao (张超), Li Bohan (李博涵) [1]
Affiliation: [1] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2024, No. 5, pp. 1325-1335 (11 pages)
Funding: General Program of the National Natural Science Foundation of China (61872182).
Abstract: Dossiers are the main records police departments keep when handling and closing cases, and they contain a large amount of crucial policing information. Extracting information from police dossiers is an important means of analyzing cases, uncovering crime trends, and improving public security management. Dossier text is mostly written in natural language by frontline police officers, which makes key information hard to extract. Traditional dossier information extraction relies heavily on manual effort and predefined templates, so it is inefficient and generalizes poorly. To address these problems, and taking the policing characteristics of dossiers into account, this paper proposes a dossier knowledge extraction method based on deep semantic analysis. The method has two core components: named entity recognition and relation extraction. The proposed named entity recognition method, which targets Chinese text, integrates the structural and glyph features of Chinese characters. The proposed relation extraction method builds on the entity recognition results and, with the help of a specially constructed policing thesaurus, supports two extraction modes, one based on trigger rules and one based on trigger words. On a public Weibo dataset and on a real dossier collection from the project partner, the ** branch bureau of ** city, the proposed named entity recognition method outperforms the baseline methods overall in entity recognition precision and recall. The automatically extracted relations have also been endorsed by the ** branch bureau, and the corresponding information system has been deployed and is in use there.
Keywords: smart policing; police dossier; knowledge extraction; named entity recognition; relation extraction
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
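
Illustrative note: the abstract describes relation extraction that runs on top of recognized entities and is driven by trigger words. The record gives no implementation details, so the following is only a minimal sketch of the general trigger-word idea, not the authors' method. The `Entity` type, the `TRIGGERS` lexicon, the relation names, and the English toy sentence are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str
    label: str  # entity type, e.g. "PERSON" or "ITEM" (hypothetical label set)
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive


# Hypothetical trigger lexicon: trigger word -> (relation, head label, tail label).
TRIGGERS = {
    "stole": ("THEFT_OF", "PERSON", "ITEM"),
    "lives in": ("RESIDES_AT", "PERSON", "LOCATION"),
}


def make_entity(sentence: str, text: str, label: str) -> Entity:
    """Locate the first occurrence of `text` in `sentence` and wrap it as an Entity."""
    start = sentence.index(text)
    return Entity(text, label, start, start + len(text))


def extract_relations(sentence: str, entities: list[Entity]) -> list[tuple[str, str, str]]:
    """Emit (head, relation, tail) when a trigger word lies between two
    adjacent entities whose labels match the trigger's expected types."""
    relations = []
    ordered = sorted(entities, key=lambda e: e.start)
    for head, tail in zip(ordered, ordered[1:]):
        between = sentence[head.end:tail.start]
        for trigger, (relation, head_label, tail_label) in TRIGGERS.items():
            if trigger in between and head.label == head_label and tail.label == tail_label:
                relations.append((head.text, relation, tail.text))
    return relations


sentence = "Zhang San stole a mobile phone."
entities = [
    make_entity(sentence, "Zhang San", "PERSON"),
    make_entity(sentence, "mobile phone", "ITEM"),
]
print(extract_relations(sentence, entities))
```

In this sketch the entities are supplied by hand. In a pipeline like the one the abstract outlines, they would come from the named entity recognition step, and the lexicon would presumably be drawn from a domain thesaurus.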