Study on a Thangka Text Relationship Classification Algorithm Constructed Based on BERT+ACRNN
(基于BERT+ACRNN的唐卡文本关系分类算法研究)

Cited by: 1


Authors: 王昱 (WANG Yu); 王铁君 (WANG Tiejun)[1]; 王鸿洋 (WANG Hongyang); 郭晓然 (GUO Xiaoran)[1] (China National Information Technology Research Institute, School of Mathematics and Computer Science, Northwest University for Nationalities, Lanzhou 730030, China)

Affiliation: [1] China National Information Technology Research Institute, School of Mathematics and Computer Science, Northwest University for Nationalities, Lanzhou 730030, Gansu, China

Published in: 《高原科学研究》 (Plateau Science Research), 2022, No. 2, pp. 102-108 (7 pages)

Funding: National Natural Science Foundation of China (62166035); Natural Science Foundation of Gansu Province (21JR7RA163); Special Project of the National Ethnic Affairs Commission for Central Universities (1001160448).

Abstract: Constructing a knowledge graph for the Thangka domain requires relation classification of Thangka texts. However, experiments found that traditional models such as the convolutional neural network (CNN) and the long short-term memory network (LSTM) generalize poorly on Thangka text and extract semantic features insufficiently. To address these problems, this paper proposes a BERT-ACRNN model. The model first obtains contextual semantic information with the BERT pre-trained language model, then extracts local feature information with a CNN and contextual feature representations with a bidirectional LSTM (Bi-LSTM) equipped with a self-attention mechanism, fuses the two kinds of features, and finally performs relation classification. Experimental results show that the BERT-ACRNN model achieves an F1 score of 93.23% on the Thangka-domain text dataset, 4.68% higher than the BERT model alone, and 2.69% and 2.81% higher than BERT-CNN and BERT-BiLSTM, respectively.
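The pipeline described in the abstract (BERT contextual encoding, a CNN branch for local features, an attention-pooled recurrent branch for context features, then fusion and classification) can be sketched numerically. This is a minimal illustration, not the paper's implementation: all dimensions and weights are made up, random contextual embeddings stand in for BERT output, and the Bi-LSTM is replaced by the raw contextual states for brevity.

```python
import numpy as np

np.random.seed(0)
seq_len, bert_dim = 8, 16                 # illustrative sizes, not from the paper
H = np.random.randn(seq_len, bert_dim)    # stand-in for BERT token embeddings

# --- CNN branch: 1-D convolution over token windows + max-pooling ---
kernel_size, n_filters = 3, 10
W_conv = np.random.randn(n_filters, kernel_size * bert_dim) * 0.1
windows = np.stack([H[i:i + kernel_size].ravel()
                    for i in range(seq_len - kernel_size + 1)])
conv_out = np.maximum(windows @ W_conv.T, 0.0)   # ReLU activation
local_feat = conv_out.max(axis=0)                # max-pool -> (n_filters,)

# --- Recurrent branch stand-in: self-attention pooling over the states ---
w_att = np.random.randn(bert_dim) * 0.1
scores = H @ w_att
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                             # attention weights over tokens
context_feat = alpha @ H                         # weighted sum -> (bert_dim,)

# --- Fusion + relation classifier (softmax over hypothetical relation labels) ---
fused = np.concatenate([local_feat, context_feat])
n_relations = 5
W_cls = np.random.randn(n_relations, fused.size) * 0.1
logits = W_cls @ fused
probs = np.exp(logits - logits.max())
probs /= probs.sum()
pred = int(probs.argmax())
print(fused.shape, pred)
```

The key design point mirrored here is that the two branches are concatenated rather than averaged, so the classifier sees local n-gram evidence and attention-weighted context side by side.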

Keywords: Thangka text; text classification; BERT-ACRNN model; pre-trained model; feature fusion

CLC Number: TP391.1 (Automation and Computer Technology: Computer Application Technology)
