Elements Extraction of Alarm Text Based on Multiple Model Fusion

Cited by: 1


Authors: GONG Yan[1]; WANG Yu; LIANG Chang-ming[1]; HUANG Lin-yu; YUE Han; XU Sheng-ying; WANG Ben-qiang (Science and Technology Department, Shanghai Public Security Bureau, Shanghai 200042, China; DATATOM, Shanghai 200030, China)

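Affiliations: [1] Science and Technology Department, Shanghai Public Security Bureau, Shanghai 200042, China; [2] DATATOM (上海德拓信息技术股份有限公司), Shanghai 200030, China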

Source: Software Guide (《软件导刊》), 2022, No. 4, pp. 98-102 (5 pages)

Funding: Scientific Research Program of the Shanghai Science and Technology Commission (18DZ1200900).

Abstract: To meet the growing demand for extracting different kinds of elements from alarm (police incident) text, a multi-model fusion method for element extraction is proposed. For elements with no obvious regularity, such as person names, place names, and organization names, a BERT+BiLSTM+CRF model is used together with the context of the text to extract key elements that carry semantic information. For data with a certain regularity, such as times and license plate numbers, a pattern recognition method extracts the elements that match predefined rules, with a high recall rate. The two methods are then fused into an integrated model for element extraction. Validation experiments show that, compared with traditional named entity recognition methods, the BERT+BiLSTM+CRF model improves the F1 score on the test set by more than 3% in every case, and the pattern recognition performance improves by more than 1%, which meets the element extraction needs of routine alarm handling.
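The rule-based branch and the fusion step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the regular expressions, the `Span` type, the helper names, and the overlap policy (rule matches take precedence over model spans) are all hypothetical, and the BERT+BiLSTM+CRF tagger is represented only by hand-written spans standing in for its output.

```python
import re
from typing import List, NamedTuple


class Span(NamedTuple):
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str
    text: str
    source: str  # "rule" or "model"


# Illustrative patterns only (assumptions, not the rules used in the paper).
PATTERNS = {
    "TIME": re.compile(r"\d{4}年\d{1,2}月\d{1,2}日(?:\d{1,2}时(?:\d{1,2}分)?)?"),
    "PLATE": re.compile(
        r"[京津沪渝冀豫云辽黑湘皖鲁新苏浙赣鄂桂甘晋蒙陕吉闽贵粤青藏川宁琼]"
        r"[A-Z][A-Z0-9]{5,6}"
    ),
}


def rule_extract(text: str) -> List[Span]:
    """Return every pattern match as a labelled span, ordered by position."""
    spans = [
        Span(m.start(), m.end(), label, m.group(), "rule")
        for label, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]
    return sorted(spans, key=lambda s: s.start)


def overlaps(a: Span, b: Span) -> bool:
    """True when two half-open character ranges share at least one character."""
    return a.start < b.end and b.start < a.end


def fuse(rule_spans: List[Span], model_spans: List[Span]) -> List[Span]:
    """Merge both span lists; assumed policy: a rule span wins any overlap."""
    kept = [m for m in model_spans
            if not any(overlaps(m, r) for r in rule_spans)]
    return sorted(list(rule_spans) + kept, key=lambda s: s.start)


def model_span(text: str, sub: str, label: str) -> Span:
    """Hypothetical stand-in for one entity predicted by the neural tagger."""
    start = text.index(sub)
    return Span(start, start + len(sub), label, sub, "model")


if __name__ == "__main__":
    text = "2021年3月5日14时30分,张三在上海市徐汇区驾驶沪A12345与他人发生碰撞"
    predicted = [model_span(text, "张三", "PER"),
                 model_span(text, "上海市徐汇区", "LOC")]
    for span in fuse(rule_extract(text), predicted):
        print(span.label, span.text, span.source)
```

Giving rule matches precedence is only one plausible way to combine the two branches; the abstract does not say how conflicts between them are resolved.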

Keywords: alarm elements; BERT; BiLSTM; CRF; named entity recognition; pattern recognition

Classification number: TP274 [Automation and Computer Technology: Detection Technology and Automation Devices]

 
