Authors: Ruan Guangce (阮光册) [1]; Tu Shiwen (涂世文) [1]; Tian Xin (田欣) [1]; Zhang Li (张莉) [2]
Affiliations: [1] Department of Information Management, Faculty of Economics and Management, East China Normal University, Shanghai 200241; [2] Shanghai Technology Development Co., Ltd, Shanghai 200235
Source: Journal of Intelligence (《情报杂志》), 2021, No. 9, pp. 147-153 (7 pages)
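Funding: Shanghai Municipal Commission of Economy and Informatization project "Shanghai Artificial Intelligence Public R&D Resource Map" (code: XX-RGZN-01-19-5037).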
Abstract: [Purpose/Significance] Duplicate names among English-language authors are very common. This paper addresses incremental name disambiguation in scientific and technical literature, with the aim of improving the precision of author search on academic retrieval platforms. [Method/Process] The paper proposes a name disambiguation method that fuses the external basic features of a document with its internal semantic features in order to attribute newly added English academic papers to the correct author. First, the metadata fields needed for name disambiguation are extracted from the literature, and a BERT model is used to produce vector representations of the metadata text that carries semantic information. Next, the fused multi-feature data is fed into XGBoost for training. Finally, the trained model is used to assign authors to newly added papers. [Result/Conclusion] In comparative experiments the method performs well, achieving an F1 score of 95.6%.
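The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not list which metadata fields are used, so the fields here (`coauthors`, `venue`, `affiliation`) are assumptions, as are all function names. The `embedding` lists stand in for BERT vectors, and the `score` callable stands in for a trained XGBoost model; neither library is called.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def jaccard(a, b):
    """Jaccard overlap between two collections of names."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def fuse_features(paper, profile):
    """Concatenate external (metadata) features with one semantic feature.

    The choice of fields is hypothetical; the source only says that external
    basic features and internal semantic features are fused.
    """
    return [
        jaccard(paper["coauthors"], profile["coauthors"]),               # external
        1.0 if paper["venue"] == profile["venue"] else 0.0,              # external
        1.0 if paper["affiliation"] == profile["affiliation"] else 0.0,  # external
        cosine(paper["embedding"], profile["embedding"]),                # semantic
    ]

def assign_author(paper, profiles, score):
    """Return the id of the candidate profile with the highest score.

    `score` maps a fused feature vector to a number; in the paper's setting a
    trained classifier would play this role.
    """
    best = max(profiles, key=lambda p: score(fuse_features(paper, p)))
    return best["author_id"]

# Toy data: two existing author profiles sharing a name, and one new paper.
profiles = [
    {"author_id": "li_wei_1", "coauthors": ["A. Smith", "C. Wu"],
     "venue": "J1", "affiliation": "U2", "embedding": [1.0, 0.0]},
    {"author_id": "li_wei_2", "coauthors": ["D. Kim"],
     "venue": "J2", "affiliation": "U3", "embedding": [0.0, 1.0]},
]
new_paper = {"coauthors": ["A. Smith", "B. Jones"], "venue": "J1",
             "affiliation": "U1", "embedding": [1.0, 0.0]}

# A plain sum of features is used as a placeholder scorer.
assigned = assign_author(new_paper, profiles, score=sum)
```

In an incremental setting, each newly added paper would be compared only against existing profiles that share the ambiguous name, so the cost per new paper grows with the number of candidates rather than with the size of the whole collection.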