A Machine Learning Approach to the Judgment of Unidentified Attribution of Ercheng Sayings  (Cited by: 1)


Authors: Bi Mengxi (毕梦曦); Zhang Liyuan (张力元)

Affiliations: [1] Department of Philosophy, Peking University; [2] Department of Information Management, Peking University

Source: Digital Humanities Research (《数字人文研究》), 2021, No. 2, pp. 21-35 (15 pages)

Abstract: This paper re-examines a traditional problem, the attribution of the sayings of the Cheng brothers (Ercheng), from the perspective of digital humanities. Using machine learning, it recasts attribution as a supervised text classification task and builds a deep learning model that combines a pretrained BERT language model with a sigmoid activation function. Passages already attributed to Cheng Yi or Cheng Hao serve as training data, and the trained model predicts the attribution of passages whose author is unknown. Accuracy reaches as high as 88%, which demonstrates the potential of deep learning for research on small-scale classical Chinese corpora. The trained model is then applied to the unattributed Ercheng sayings, and part of the results are reported: 30% of Erchengyishu (《程氏遗书》) and 20% of Erchengwaishu (《程氏外书》) are attributed to Cheng Hao. Of particular interest, the paper offers a preliminary judgment on the authorship of texts whose attribution has long been uncertain, such as Cui Yan (《粹言》) and Jing Shuo (《经说》).
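The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vector stands in for the output of a BERT encoder, the weights and bias are hypothetical values, and treating the positive class as Cheng Hao with a 0.5 threshold is an assumption made only for illustration.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_prob(features: Sequence[float],
                 weights: Sequence[float],
                 bias: float) -> float:
    """Probability of the positive class from a linear layer plus sigmoid.

    `features` is a stand-in for a sentence embedding (assumption).
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def attribute(prob: float, threshold: float = 0.5) -> str:
    """Map a probability to an author label (positive class assumed to be Cheng Hao)."""
    return "Cheng Hao" if prob >= threshold else "Cheng Yi"


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions that match the known attribution."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and equal in length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def share_attributed(labels: Sequence[str], author: str) -> float:
    """Fraction of passages in a collection assigned to `author`."""
    if not labels:
        raise ValueError("labels must be non-empty")
    return sum(label == author for label in labels) / len(labels)


if __name__ == "__main__":
    # Hypothetical two-dimensional embeddings and weights, for illustration only.
    weights: List[float] = [2.0, -1.0]
    passages = [[1.0, 0.0], [0.0, 2.0], [0.0, 1.0], [-1.0, 0.5]]
    labels = [attribute(predict_prob(p, weights, 0.0)) for p in passages]
    print(labels)
    print(share_attributed(labels, "Cheng Hao"))
```

Figures such as the reported accuracy or the share of a collection attributed to Cheng Hao correspond to `accuracy` and `share_attributed` applied to the model's predicted labels.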

Keywords: Ercheng sayings; Ercheng (the Cheng brothers); Cheng Yi; Cheng Hao; BERT; machine learning; text classification

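Classification codes: B244.6 [Philosophy and Religion: Chinese Philosophy]; G256 [Culture Science: Library Science]; TP181 [Automation and Computer Technology: Control Theory and Control Engineering]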

 
