Authors: Shen Zhe (沈喆), Wang Yi (王毅)[1], Ju Xiufang (鞠秀芳), Cheng Ying (成颖)[1]
Affiliations: [1] School of Information Management, Nanjing University, Nanjing 210023; [2] Institute for Chinese Social Sciences Research and Assessment, Nanjing University, Nanjing 210093
Source: 《情报学报》 (Journal of the China Society for Scientific and Technical Information), 2022, No. 4, pp. 350-363 (14 pages)
Funding: National Social Science Fund of China project "Theoretical and Empirical Research on the Evaluation of Disruptive Innovation in Academic Literature" (20BTQ086).
Abstract: A complete and accurate set of each scholar's academic output provides an essential data foundation for scientometrics and the evaluation of research talent. Because existing machine-learning-based author name disambiguation (AND) methods do not yet meet practical requirements, this study targets high-level research talent and exploits the high precision of rule-based methods to propose a "two-step" AND model that optimizes precision first and recall second. Résumés, research interests, and representative works of such researchers are easy to collect from the web, so the model can draw on a richer set of features, which underpins its strong performance. The model was validated on recipients of the National Science Fund for Distinguished Young Scholars. The results show that it performs well in both precision and recall, reaching F1 scores of 0.93 and 0.95 on two different feature sets, a substantial improvement over the baseline model.
Classification: TP391.1 [Automation and Computer Technology - Computer Application Technology]
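The abstract describes a "precision first, recall second" strategy without listing the actual rules, so the following is only a minimal sketch of that general idea, not the authors' model. Every name and rule here is an assumption made for illustration: `Paper`, `Profile`, the affiliation and representative-title checks in step 1, and the coauthor-overlap expansion in step 2 are all hypothetical. The `f1` helper is the standard harmonic mean of precision and recall.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical candidate record sharing the scholar's name."""
    title: str
    affiliation: str
    coauthors: frozenset = frozenset()


@dataclass
class Profile:
    """Hypothetical external data gathered from a scholar's résumé."""
    affiliations: set
    representative_titles: set


def step1_precision(papers, profile):
    # Assumed strict rules: accept a paper only on direct profile evidence.
    return [p for p in papers
            if p.title in profile.representative_titles
            or p.affiliation in profile.affiliations]


def step2_recall(papers, seeds):
    # Assumed looser rule: accept remaining papers that share a coauthor
    # with any paper already accepted in step 1.
    known = set().union(*(p.coauthors for p in seeds))
    return [p for p in papers if p not in seeds and p.coauthors & known]


def disambiguate(papers, profile):
    seeds = step1_precision(papers, profile)
    return seeds + step2_recall(papers, seeds)


def f1(predicted, truth):
    # Standard F1: harmonic mean of precision and recall.
    tp = len(set(predicted) & set(truth))
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, step 1 builds a small, trustworthy seed set, and step 2 only widens it using evidence derived from those seeds, which is one plausible way to raise recall without discarding the precision gained first.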