Uyghur Named Entity Recognition Based on BiLSTM-CNN-CRF Model (cited by: 23)


Authors: Maimaitiayifu, SILAMU Wushouer, MUHETAER Palidan, YANG Wenzhong [1] (College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China)

Affiliation: [1] College of Information Science and Engineering, Xinjiang University, Urumqi 830046

Source: Computer Engineering, 2018, No. 8, pp. 230-236 (7 pages)

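Funding: National Key Basic Research Program of China (2014CB340506); National Natural Science Foundation of China (61363063); Open Project of the Xinjiang University Multilingual Key Laboratory (XJDX0905-2013-01)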

Abstract: To improve Uyghur named entity recognition (NER) performance when resources are scarce and without relying on hand-crafted features, a neural network model based on BiLSTM-CNN-CRF is constructed. First, a convolutional neural network (CNN) is used to train character vectors that capture morphological features of Uyghur words, such as suffixes and prefixes. Then, a skip-gram model is trained on a large-scale corpus to generate low-dimensional, dense, real-valued word vectors that carry semantic information. Finally, the character vectors, part-of-speech vectors, and word vectors are concatenated and used as input to a BiLSTM-CRF deep neural network suited to Uyghur NER. Experimental results show that the model can recognize named entities automatically and is robust, reaching an F1 score of 91.89%.
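The input construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the vector sizes, the filter width, the absence of a nonlinearity, and the helper names `char_cnn_vector` and `token_input` are all assumptions, and random numbers stand in for trained character, part-of-speech, and skip-gram vectors. The BiLSTM and CRF layers that would consume these features are omitted.

```python
import random


def char_cnn_vector(char_embs, filters, width=3):
    """1D convolution over a word's character embeddings with max-over-time pooling.

    char_embs: list of per-character embedding vectors, each of length d.
    filters:   list of filters; each filter is `width` weight vectors of length d.
    Returns one pooled value per filter.
    """
    d = len(char_embs[0])
    pad = [[0.0] * d for _ in range(width - 1)]
    # Zero-pad both ends so that windows also cover prefixes and suffixes.
    padded = pad + char_embs + pad
    pooled = []
    for f in filters:
        best = float("-inf")
        for start in range(len(padded) - width + 1):
            window = padded[start:start + width]
            score = sum(w * x
                        for wv, xv in zip(f, window)
                        for w, x in zip(wv, xv))
            best = max(best, score)  # max-over-time pooling
        pooled.append(best)
    return pooled


def token_input(char_embs, pos_vec, word_vec, filters, width=3):
    """Concatenate character, part-of-speech, and word vectors for one token."""
    return char_cnn_vector(char_embs, filters, width) + pos_vec + word_vec


# Hypothetical toy sizes; the paper's actual dimensions are not given here.
random.seed(0)
CHAR_DIM, N_FILTERS, POS_DIM, WORD_DIM, WIDTH = 4, 5, 3, 6, 3


def rand_vec(n):
    """Random stand-in for a trained embedding or filter weight vector."""
    return [random.uniform(-1.0, 1.0) for _ in range(n)]


word = "kitablar"  # illustrative Latin-script token: a stem plus a suffix
char_table = {c: rand_vec(CHAR_DIM) for c in sorted(set(word))}
filters = [[rand_vec(CHAR_DIM) for _ in range(WIDTH)] for _ in range(N_FILTERS)]

features = token_input(
    [char_table[c] for c in word],  # character embeddings for the CNN
    rand_vec(POS_DIM),              # stand-in part-of-speech vector
    rand_vec(WORD_DIM),             # stand-in skip-gram word vector
    filters,
    WIDTH,
)
print(len(features))  # N_FILTERS + POS_DIM + WORD_DIM = 14
```

In a full model, one such concatenated vector per token would be fed in sequence to the BiLSTM, whose outputs the CRF layer would then decode into entity labels.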

Keywords: recurrent neural network; convolutional neural network; conditional random field; Uyghur; named entity recognition

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
