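Authors: CUI Jing-jing (崔静静); HU Ze-wen (胡泽文); REN Ping (任萍)
Affiliation: [1] School of Management Science and Engineering, Nanjing University of Information Science & Technology, Nanjing 210044, Jiangsu, China
Source: Information Science (《情报科学》), 2022, No. 5, pp. 90-96 (7 pages)
Funding: National Social Science Fund of China project "Methods and Applications for Identifying Potential 'High-Quality' Papers in Massive Scientific and Technical Literature" (20CTQ031).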
Abstract: [Purpose/significance] The scientific literature contains a large number of potentially excellent ("hidden treasure") papers that have not yet been recognized. How to identify and make use of such papers is a research question of practical significance. [Method/process] Using the full-text and citation data of artificial intelligence papers published between 1990 and 2010 in the Web of Science database, this study constructs a paper-citation feature vector space for the field, then trains decision tree and logistic regression models on that space and tests them on the task of identifying potentially excellent papers. [Result/conclusion] The experiments show that adding the feature "citations received within five years after publication" significantly improves the classification performance of both models: the accuracy of the decision tree and logistic regression models rises to above 84% and 89% respectively, an improvement of more than 20 percentage points. The logistic regression model consistently outperforms the decision tree model, and tuning the hyperparameters of both models yields further gains. In addition, research in the early artificial intelligence field was still at the stage of small-team collaboration, and the share of papers with funding support or open access was low. [Innovation/limitation] The paper introduces machine learning methods to build and apply models for identifying potentially excellent papers in massive literature collections. However, the models still need to be extended to more disciplines.
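The comparison described in the abstract (the same classifier trained with and without an early-citation feature) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the feature names `team_size`, `funded` and `early_cites` and the labelling rule are hypothetical, and the logistic regression is a plain gradient-descent implementation rather than the authors' pipeline. The decision tree side of the study is omitted for brevity.

```python
import math
import random

def make_papers(n, rng):
    """Generate synthetic paper records; features and labelling rule are invented."""
    papers = []
    for _ in range(n):
        team_size = rng.randint(1, 6)           # hypothetical number of authors
        funded = rng.randint(0, 1)              # hypothetical funding flag
        early_cites = rng.expovariate(1 / 10)   # hypothetical citations in first five years
        # Hypothetical ground truth: driven mostly by early citations, plus noise.
        label = 1 if early_cites + rng.gauss(0, 3) > 12 else 0
        papers.append(([team_size, funded, early_cites], label))
    return papers

def fit_scaler(X):
    """Return per-column means and standard deviations of the training data."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def scale(X, means, stds):
    """Standardize each column with the given means and standard deviations."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

def fit_logistic(X, y, epochs=500, lr=0.5):
    """Fit logistic regression by full-batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def accuracy(w, b, X, y):
    """Share of records whose predicted class (z > 0 means class 1) matches the label."""
    hits = sum((b + sum(wj * xj for wj, xj in zip(w, xi)) > 0) == (yi == 1)
               for xi, yi in zip(X, y))
    return hits / len(y)

def evaluate(train, test, cols):
    """Train on the selected feature columns and return test-set accuracy."""
    X_train = [[row[c] for c in cols] for row, _ in train]
    X_test = [[row[c] for c in cols] for row, _ in test]
    y_train = [label for _, label in train]
    y_test = [label for _, label in test]
    means, stds = fit_scaler(X_train)
    w, b = fit_logistic(scale(X_train, means, stds), y_train)
    return accuracy(w, b, scale(X_test, means, stds), y_test)

rng = random.Random(0)
train, test = make_papers(600, rng), make_papers(200, rng)
acc_without = evaluate(train, test, cols=[0, 1])     # no early-citation feature
acc_with = evaluate(train, test, cols=[0, 1, 2])     # early-citation feature added
print(f"accuracy without early citations: {acc_without:.2f}")
print(f"accuracy with early citations:    {acc_with:.2f}")
```

Because the synthetic labels are generated from the early-citation value, the gain here is built in by construction; the sketch shows only the shape of the ablation, not evidence for the reported 84% and 89% figures.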