Detection of Dispute on Authors of Ancient Chinese Poems by Naive Bayes Classifier (基于朴素贝叶斯分类器的古诗词作者争议检测)


Authors: Huang Wei (黄玮); Ran Qibin (冉启斌)[1]

Affiliation: [1] School of Literature, Nankai University (南开大学文学院)

Source: Literature and Culture Studies (《文学与文化》), 2023, No. 3, pp. 95-104 (10 pages)

Abstract: This paper examines six poems of disputed authorship, including "Deep, Deep the Courtyard" to the tune of Butterfly in Love with Flowers (《蝶恋花(庭院深深深几许)》). The other works of each candidate author are collected as a training corpus. After word segmentation and feature extraction, a Naive Bayes classifier learns each author's characteristics and is then used to judge the authorship of the disputed works. The results report the probability of each candidate author for every disputed poem and, except for "Lantern Festival" to the tune of Shengzhazi (《生查子·元夕》), agree closely with the findings of textual scholarship. The paper also collects works by three pairs of Tang poets who are conventionally named together: "Yuan-Bai" (Yuan Zhen and Bai Juyi), "Pi-Lu" (Pi Rixiu and Lu Guimeng), and "Little Li-Du" (Li Shangyin and Du Mu). Applying the same classifier to these pairs gives good results, which further supports the effectiveness of the method for authorship detection.
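The pipeline outlined in the abstract (segment the works, extract features, train a Naive Bayes classifier, then report a probability for each candidate author) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the abstract does not name the segmentation tool, the feature set, or the Naive Bayes variant, so this sketch takes already-segmented token lists, uses a multinomial model with add-one (Laplace) smoothing, and runs on made-up English placeholder tokens under the hypothetical labels `author_A` and `author_B`.

```python
import math
from collections import Counter


def train(corpus):
    """Fit a multinomial Naive Bayes model.

    corpus: dict mapping author name -> list of works,
            where each work is a list of already-segmented tokens.
    """
    vocab = {tok for works in corpus.values() for work in works for tok in work}
    total_works = sum(len(works) for works in corpus.values())
    model = {}
    for author, works in corpus.items():
        counts = Counter(tok for work in works for tok in work)
        model[author] = {
            # Prior: share of the training works written by this author.
            "log_prior": math.log(len(works) / total_works),
            "counts": counts,
            "total": sum(counts.values()),
        }
    return model, vocab


def posteriors(model, vocab, tokens, alpha=1.0):
    """Return P(author | tokens) for every candidate author."""
    log_scores = {}
    for author, m in model.items():
        denom = m["total"] + alpha * len(vocab)
        score = m["log_prior"]
        for tok in tokens:
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((m["counts"][tok] + alpha) / denom)
        log_scores[author] = score
    # Normalize in log space (log-sum-exp) to avoid underflow on long texts.
    top = max(log_scores.values())
    unnorm = {a: math.exp(s - top) for a, s in log_scores.items()}
    z = sum(unnorm.values())
    return {a: v / z for a, v in unnorm.items()}


# Toy data: placeholder tokens, NOT taken from the paper's corpus.
corpus = {
    "author_A": [["moon", "river", "moon"], ["river", "boat"]],
    "author_B": [["wine", "sword"], ["wine", "horse", "wine"]],
}
model, vocab = train(corpus)
probs = posteriors(model, vocab, ["moon", "boat"])
print(probs)
```

In this toy run the disputed token list shares vocabulary only with `author_A`, so that candidate receives the higher posterior. A real study would substitute segmented poem texts for the placeholder tokens and would need to choose how to treat tokens unseen in training, which this sketch simply skips.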

Keywords: authorship dispute; stylistic features of works; Naive Bayes classifier; ancient Chinese poetry

Classification codes: I207.2 [Literature: Chinese literature]; TP18 [Automation and computer technology: control theory and control engineering]

 
