Python Named Entity Recognition Model Based on Transformer (cited by: 3)


Authors: XU Guanyou; FENG Weisen [1] (College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China)

Affiliation: [1] College of Computer Science, Sichuan University, Chengdu 610065, China

Source: Journal of Computer Applications (《计算机应用》), 2022, Issue 9, pp. 2693-2700

Abstract: Recently, some character-based Named Entity Recognition (NER) models have been unable to make full use of word information, while lattice-structure models that do use word information may degenerate into word-based models and produce word segmentation errors. To address these problems, a transformer-based python NER model was proposed to encode character-word information. First, word information was bound to the characters corresponding to the beginning or end of each word. Then, three different strategies were used to encode the word information into a fixed-size representation through the transformer. Finally, a Conditional Random Field (CRF) was used for decoding, thereby avoiding the word segmentation errors caused by obtaining word boundary information and improving batch training speed. Experimental results on the python dataset show that the F1 score of the proposed model is 2.64 percentage points higher than that of the Lattice-LSTM model, while its training time is about a quarter of that of the comparison model, indicating that the proposed model can prevent model degradation, speed up batch training, and better recognize python named entities.
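The first step of the method, binding lexicon words to the characters at their beginning or end positions, can be sketched as follows. This is a minimal illustration under assumptions: the toy lexicon, the function name `bind_words`, and the enumeration of all substrings are hypothetical, not the authors' implementation.

```python
def bind_words(sentence, lexicon):
    """Attach each lexicon word to the characters at its start and end.

    words_at_begin[i] lists lexicon words beginning at character i;
    words_at_end[i] lists lexicon words ending at character i.
    (Illustrative sketch only; the paper's actual matching procedure
    is not specified in the abstract.)
    """
    n = len(sentence)
    words_at_begin = [[] for _ in range(n)]
    words_at_end = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n + 1):  # candidate span sentence[i:j]
            word = sentence[i:j]
            if word in lexicon:
                words_at_begin[i].append(word)
                words_at_end[j - 1].append(word)
    return words_at_begin, words_at_end

# Hypothetical example: a character sequence with overlapping lexicon matches
begin, end = bind_words("listcomp", {"list", "comp", "listcomp"})
print(begin[0])  # words starting at character 0
print(end[7])    # words ending at the last character
```

After this binding, each character carries the set of words it starts or ends; these word sets are what the three transformer encoding strategies would then compress into a fixed-size representation before CRF decoding.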

Keywords: named entity recognition; word boundary; Python; word information; Transformer

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
