The Trigger Verb Classification Method of Event Sentences in Ancient Chinese Classics Based on Bi-LSTM  (Cited by: 3)


Authors: MA Xiaowen; HE Lin[1]; LIU Jianbin; LI Zhangchao; GAO Dan (College of Information Management, Nanjing Agricultural University, Nanjing 210095)

Affiliation: [1] College of Information Management, Nanjing Agricultural University, Nanjing 210095

Source: Journal of Library and Information Science in Agriculture (《农业图书情报学报》), 2021, Issue 9, pp. 27-36 (10 pages)

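Funding: National Social Science Fund of China project "Research on Methods for Automatically Constructing a Knowledge Representation System of Traditional Chinese Culture Based on Classical Texts" (18BTQ063).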

Abstract: [Purpose/Significance] Recognizing and classifying trigger verbs in ancient Chinese texts is an important step toward deep mining and content disclosure of those texts in digital humanities research. This paper uses a deep learning classifier to explore an automated method for multi-class classification of event sentences based on their trigger words. [Method/Process] After constructing a classification scheme for event trigger words in the classics and a trigger word dictionary, the authors select event sentences from four categories as experimental data. Category labels are encoded with one-hot encoding and sentence texts with a Tokenizer; the encoded data are then used to train a Bi-LSTM classifier. Comparative experiments are run by varying the parameters, and classifier performance is analyzed with standard evaluation metrics. [Results/Conclusions] After repeated training and tuning, the classifier reaches an accuracy of 0.95 on the test set, which indicates that the deep learning approach and the constructed trigger word dataset can effectively support automatic multi-class classification of event sentences in ancient texts.

Keywords: trigger word classification; Bi-LSTM model; multi-class classification; Zuo Zhuan (《左传》)

Classification code: G350 [Cultural Science: Information Science]

 
