Research on Model for Recognition of Journal Citation Manipulation Behavior Based on CART

Cited by: 3


Authors: Sun Jianjun (孙建军)[1], Ju Xiufang (鞠秀芳)[1], Pei Lei (裴雷)[1], Zheng Yanning (郑彦宁)[2], Pan Yuntao (潘云涛)[2]

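Affiliations: [1] School of Information Management, Nanjing University, Nanjing 210093; [2] Institute of Scientific and Technical Information of China, Beijing 100038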

Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), 2013, No. 10, pp. 1058-1067 (10 pages)

Abstract: Driven by self-interest, some academic journals rapidly inflate their citation counts, impact factors, and related indicators through heavy self-citation or by forming "mutual citation alliances", which undermines the fairness of citation analysis. To address this, the paper uses the CART classification algorithm from data mining to build a model for recognizing journal citation manipulation behavior, and designs four evaluation indicators for identifying such behavior: self-citation rate, cited-year distribution, cited density ratio, and citation density ratio. The experimental sample consists of 50 comprehensive social science journals drawn from a Chinese citation database; the 2009 citation data of this journal group serve as the training data set and the 2008 citation data as the validation data set. Finally, the 2010 citation data are used to verify the validity of the recognition model. The experimental results show that the classification model can effectively recognize journal citation manipulation behavior.
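As a rough illustration of the technique named in the abstract, the sketch below shows the core step of CART: choosing the binary split on one indicator that minimizes weighted Gini impurity. It is not the authors' model. The abstract does not give formulas for the four indicators, so the `self_citation_rate` definition (self-citations divided by total citations), the two-feature layout, and all data values are assumptions made purely for illustration.

```python
from collections import Counter


def self_citation_rate(self_citations, total_citations):
    """Assumed definition: share of a journal's citations that are self-citations."""
    return self_citations / total_citations if total_citations else 0.0


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({row[f] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            t = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


# Hypothetical toy data, not from the paper:
# each row is (self-citation rate, cited density ratio); label 1 = suspected manipulation.
rows = [(0.05, 1.0), (0.08, 1.2), (0.10, 0.9), (0.35, 1.1), (0.40, 3.0), (0.45, 2.5)]
labels = [0, 0, 0, 1, 1, 1]

feature, threshold, impurity = best_split(rows, labels)
print(f"split on feature {feature} at {threshold:.3f}, weighted Gini {impurity:.3f}")
```

A full CART tree applies this split search recursively to each resulting subset and then prunes the tree. A held-out set, such as the validation year mentioned in the abstract, is commonly used to choose the pruned tree.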

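Keywords: journal citation manipulation behavior; CART algorithm; self-citation rate; cited-year distribution; cited density ratio; citation density ratio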

Classification number: G353.1 [Cultural Science - Information Science]

 
