Text Extraction Model for Illegal Entry-Exit Records Based on BERT-BiLSTM-CRF


Authors: GUO Jingjing, LI Junjie, ZHOU Wei[1], WEI Yanyan[1]

Affiliations: [1] Guangxi Minzu University, Nanning 530006, China; [2] Information Technology Department of Guangxi Entry Exit Frontier Inspection Station, Nanning 530022, China

Source: Chinese Journal of Computer Application (《计算机应用文摘》), 2023, No. 13, pp. 43-45 (3 pages)

Abstract: To improve named entity recognition in information extraction from illegal entry-exit records, this paper proposes an extraction model for such records that integrates a language model. The model first uses BERT to encode the words in the input sequence, obtaining a vector representation of each word. These vectors are then fed into a BiLSTM network, which models the input sequence and learns its contextual information and grammatical structure. Finally, a CRF layer labels the output of the BiLSTM network to produce the final output sequence. Experimental results show that the model is well suited to the task of extracting text from illegal entry-exit records. In a cooperative project with the Guangxi Border Inspection Station, the model was ultimately deployed in production, where it facilitates the record extraction work of the border inspection police.
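The last step described in the abstract, in which a CRF layer labels the BiLSTM output, can be illustrated with a small decoding example. The abstract does not say how the tag sequence is decoded; the sketch below assumes the Viterbi algorithm, which is the usual choice for a linear-chain CRF. The tag set, the scores, and the names `viterbi_decode`, `TAGS`, `TRANSITIONS`, and `EMISSIONS` are all hypothetical and are not taken from the paper. The emission scores are hand-written stand-ins for what a BiLSTM would output.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag path for one sentence.

    emissions[t][tag]        : score of `tag` at position t (stand-in for BiLSTM output)
    transitions[(prev, cur)] : score of moving from `prev` to `cur`; missing pairs score 0
    """
    if not emissions:
        return []

    # best[t][tag] is the best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t - 1][tag] is the previous tag on that best path

    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical BIO tag set for a single entity type (person name).
TAGS = ["O", "B-PER", "I-PER"]

# Assumed constraint: an I-PER tag may not directly follow O.
TRANSITIONS = {("O", "I-PER"): -1e4}

# Hand-written emission scores for a three-token sentence.
EMISSIONS = [
    {"O": 2.0, "B-PER": 1.5, "I-PER": 0.1},
    {"O": 0.2, "B-PER": 0.5, "I-PER": 2.0},
    {"O": 1.5, "B-PER": 0.1, "I-PER": 0.3},
]

greedy = [max(TAGS, key=lambda tag: step[tag]) for step in EMISSIONS]
decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
print(greedy)   # ['O', 'I-PER', 'O']
print(decoded)  # ['B-PER', 'I-PER', 'O']
```

Picking the top-scoring tag at each position independently yields `O, I-PER, O`, which is not a valid BIO sequence. Decoding with the transition scores instead yields `B-PER, I-PER, O`, which is the kind of sequence-level consistency a CRF layer is generally added to provide.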

Keywords: illegal entry-exit record text; named entity recognition; BERT pre-trained language model; BiLSTM; CRF

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
