Authors: GUO Xiaoran (郭晓然) [1]; WANG Weilan (王维兰) [2]; LUO Ping (罗平) [3]
Affiliations: [1] School of Mathematics and Computer Science, Northwest Minzu University, Lanzhou 730030, Gansu, China; [2] Key Laboratory of China's Ethnic Languages and Information Technology of the Ministry of Education, Northwest Minzu University, Lanzhou 730030, Gansu, China; [3] School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China
Source: Plateau Science Research (《高原科学研究》), 2020, No. 4, pp. 87-94 (8 pages)
Funding: National Natural Science Foundation of China (61162021); Innovative Research Team Program of the State Ethnic Affairs Commission (No. [2018] 98); Innovation Project for Young Teachers of the Central Universities (31920200067)
Abstract: Named entity recognition is a fundamental task in natural language processing and an important basis for knowledge graph construction. To address the difficulty of recognizing the names of various deities in Tibetan Buddhist classics translated into Chinese, whose naming patterns are not fixed, a multi-neural-network fusion method, BERT-BiLSTM-CRF-a, is proposed, combining the BERT pre-trained language model, a bidirectional long short-term memory network (BiLSTM), and a conditional random field (CRF). The method uses BERT instead of a shallow network to train character vectors, fully representing the polysemy of characters; it introduces attention-style weighting, in which the forward and backward hidden vectors of the BiLSTM layer are weighted before being concatenated, further improving the effective use of context features; finally, a CRF layer outputs the optimal label sequence. Experiments show that the method achieves 95.2% accuracy on the test set, 7.6% higher than the traditional BiLSTM-CRF model, and its recall is also 8.7% higher. It can therefore be applied to the task of recognizing deity names in Tibetan Buddhist classics translated into Chinese.
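The architecture described in the abstract can be summarized in a short sketch. The following is a minimal, illustrative PyTorch implementation of the BERT-BiLSTM-CRF-a idea, not the authors' code: it assumes the Hugging Face transformers BertModel, the third-party pytorch-crf package (torchcrf.CRF), and a simple learned scalar weighting of the forward and backward BiLSTM states as one plausible reading of the attention-style weighting; all names, hidden sizes, and hyperparameters are hypothetical.

```python
# Minimal sketch of the BERT-BiLSTM-CRF-a idea, under the assumptions stated above.
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF  # third-party package: pip install pytorch-crf


class BertBiLstmCrfA(nn.Module):
    def __init__(self, num_tags: int, bert_name: str = "bert-base-chinese",
                 lstm_hidden: int = 256):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)          # character-level contextual embeddings
        self.bilstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                              batch_first=True, bidirectional=True)
        # Learned scalar weights for the forward / backward directions,
        # one plausible reading of "weighting before concatenation".
        self.direction_logits = nn.Parameter(torch.zeros(2))
        self.emission = nn.Linear(2 * lstm_hidden, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def _features(self, input_ids, attention_mask):
        ctx = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        out, _ = self.bilstm(ctx)                                  # (B, T, 2*H)
        fwd, bwd = out.chunk(2, dim=-1)                            # split forward / backward halves
        w = torch.softmax(self.direction_logits, dim=0)            # attention-style direction weights
        weighted = torch.cat([w[0] * fwd, w[1] * bwd], dim=-1)     # weight, then concatenate
        return self.emission(weighted)                             # per-token tag scores for the CRF

    def loss(self, input_ids, attention_mask, tags):
        feats = self._features(input_ids, attention_mask)
        return -self.crf(feats, tags, mask=attention_mask.bool(), reduction="mean")

    def predict(self, input_ids, attention_mask):
        feats = self._features(input_ids, attention_mask)
        return self.crf.decode(feats, mask=attention_mask.bool())  # best tag sequence per sentence
```

In use, a tokenizer such as BertTokenizerFast for "bert-base-chinese" would supply input_ids and attention_mask, with tags typically encoded character by character in a BIO-style scheme over the deity-name spans.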
Keywords: Tibetan Buddhist deities; named entity recognition; BERT pre-trained model; attention mechanism
Classification: TP391.4 (Automation and Computer Technology: Computer Application Technology)