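Construction and application of a method for identifying and classifying journals with manipulated impact factors (cited by: 2)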

Identification and classification of journals of impact factor manipulation

Authors: JIANG Fenghui [1], LIU Xiangpeng [2], SHAO Wei [3], CHEN Chunping [1], YU Longzhen [4]

Affiliations: [1] Editorial Office of Journal of Qingdao University of Science and Technology (Natural Science Edition), 99 Songling Road, Laoshan District, Qingdao 266061, Shandong, China; [2] School of Mathematics and Physics, Qingdao University of Science and Technology, 99 Songling Road, Laoshan District, Qingdao 266061, Shandong, China; [3] College of Automation and Electronic Engineering, Qingdao University of Science and Technology, 99 Songling Road, Laoshan District, Qingdao 266061, Shandong, China; [4] College of Economics and Management, Qingdao University of Science and Technology, 99 Songling Road, Laoshan District, Qingdao 266061, Shandong, China

Source: Chinese Journal of Scientific and Technical Periodicals (《中国科技期刊研究》), 2023, No. 2, pp. 136-143 (8 pages)

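Funding: Society of China University Journals project "Identification of and countermeasures against journal impact factor manipulation patterns based on big data and artificial intelligence algorithms" (CUJS-CX-2021-029); Shandong Provincial Department of Education project "Construction project for the high-quality development of journals in Shandong higher education institutions" (JYTQKB202211).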

Abstract: [Purposes] Widespread manipulation of journal impact factors undermines the objectivity of the impact factor. Such misconduct should be strictly prohibited, and effective ways of identifying manipulated journals are urgently needed. [Methods] Using the historical JCR data published on the Web of Science platform, multi-year data on 14 bibliometric indicators were collected for normal journals and for abnormal journals (those suppressed because their impact factors had been manipulated), forming one normal and one abnormal journal data set. Machine learning programs written with the Python Scikit-learn library were trained, validated, and tested on the training, validation, and test sets generated by merging the two data sets. [Findings] The machine learning algorithms classify the normal and abnormal journal data effectively: accuracy, precision, and recall on the validation set all exceed 98%, and the five most important features together account for 91.55% of feature importance. For journals that returned to normal after suppression, the recognition performance of some algorithms begins to decline on data from the fifth year after suppression. All journals with an editorial expression of concern are classified as abnormal, and the journals that were suppressed or received a suppression warning in the 2021 edition of JCR are all correctly classified as abnormal. The support vector machine algorithm gives the best predictive performance. [Conclusions] Machine learning algorithms offer inherent advantages of speed and objectivity in identifying journals with manipulated impact factors. As manipulation techniques and bibliometric indicators continue to multiply, it becomes increasingly difficult to identify and judge manipulated journals by manually combining indicators, and the advantages of machine learning algorithms become increasingly evident.
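The workflow summarized in the Methods can be illustrated with a minimal Scikit-learn sketch. Everything in it beyond the general idea is an assumption made for illustration, not the authors' code: the data are synthetic (generated by `make_classification` as a stand-in for the 14 JCR indicators), the 60/20/20 split ratio and the RBF kernel are arbitrary choices, and a random forest is used only as one common way to obtain feature importances, since the abstract does not say which algorithm produced the reported 91.55%.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for journal-year records (assumption): 14 columns mimic
# the 14 bibliometric indicators; label 1 = abnormal (suppressed), 0 = normal.
X, y = make_classification(
    n_samples=600, n_features=14, n_informative=5, n_redundant=2,
    class_sep=2.0, random_state=0,
)

# Split the merged data into training, validation, and test sets (60/20/20).
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0
)

# Support vector machine with feature scaling, fitted on the training set only.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
svm.fit(X_train, y_train)


def report(model, features, labels):
    """Return accuracy, precision, and recall for the abnormal class."""
    pred = model.predict(features)
    return {
        "accuracy": accuracy_score(labels, pred),
        "precision": precision_score(labels, pred),
        "recall": recall_score(labels, pred),
    }


val_metrics = report(svm, X_val, y_val)
test_metrics = report(svm, X_test, y_test)

# Impurity-based feature importances from a random forest; the share held by
# the five largest plays the role of the "top-5 feature importance" figure.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
top5 = np.argsort(forest.feature_importances_)[::-1][:5]
top5_share = float(forest.feature_importances_[top5].sum())

print("validation:", val_metrics)
print("test:", test_metrics)
print("top-5 feature indices:", top5.tolist(), "share:", round(top5_share, 4))
```

With real data, `X` would hold one row per journal-year and `y` the normal/abnormal label; the numbers printed by this sketch describe only the synthetic data and say nothing about the results reported in the paper.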

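Keywords: impact factor manipulation; JCR suppressed journals; JCR editorial expression of concern journals; JCR indicators; machine learning; automatic identification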

Classification code: G237.5 [Culture and Science]
