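Authors: Wang Ruiyun [1], Jia Junzhi [1]
Affiliation: [1] School of Economics and Management, Shanxi University
Source: Journal of The National Library of China (《国家图书馆学刊》), 2017, No. 2, pp. 79-86 (8 pages)
Funding: One of the research outputs of the National Social Science Fund of China key project "Research on Semantic Description and Data Aggregation of Chinese Name Authority Files Based on Linked Data" (project No. 15ATQ004)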
Abstract: This paper addresses entity-based matching and clustering of retrieval result sets from China's union database of personal name authority records. It analyses the retrieval service and record characteristics of the database of the Cooperation Committee of Chinese Name Authority (CCCNA) and finds that each retrieval result set contains too many records, which need to be clustered by entity on a semantic basis. The proposed approach first preprocesses the result set to remove duplicate records and records with an obvious mismatch between name and semantics. It then clusters the personal name authority records using the extracted attributes of the personal entity: name, birth year, titles of the books associated with the person, and linked external records (control numbers). The resulting clusters are linked to VIAF. Empirical statistics show that the number of clusters in each processed result set is notably lower than the number of records before processing, and the linkage to VIAF also confirms the effectiveness of the method. Bibliographic matching here relies on title matching only: because reference formats and languages differ across sources, the book title serves merely as an aid in identifying the personal entity among records that share a name, and this loses some useful clustering information. Future research will integrate the bibliographic databases of library institutions and extract more bibliographic information for clustering. 4 figs. 6 tabs. 16 refs.
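The two-stage procedure summarized in the abstract (preprocess, then cluster by entity attributes) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Record` fields, the exact-match name filter, the matching rule in `same_entity` (conflicting birth years block a match; a shared external identifier, a shared title, or an identical known birth year counts as a match), and the union-find grouping. The paper's actual matching criteria may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical, simplified view of one name authority record."""
    record_id: str
    name: str
    birth_year: Optional[int] = None
    titles: frozenset = frozenset()        # titles of associated books
    external_ids: frozenset = frozenset()  # linked external control numbers


def preprocess(records, query_name):
    """Drop records whose name differs from the query and exact duplicates."""
    seen = set()
    kept = []
    for r in records:
        key = (r.name, r.birth_year, r.titles, r.external_ids)
        if r.name != query_name or key in seen:
            continue
        seen.add(key)
        kept.append(r)
    return kept


def same_entity(a, b):
    """Assumed pairwise rule for deciding that two records describe one person."""
    both_known = a.birth_year is not None and b.birth_year is not None
    if both_known and a.birth_year != b.birth_year:
        return False  # conflicting birth years: treat as different people
    if a.external_ids & b.external_ids:
        return True   # share a linked external record
    if a.titles & b.titles:
        return True   # share at least one book title
    return both_known  # same known birth year (and same name after preprocessing)


def cluster(records):
    """Group records into clusters with union-find over pairwise matches."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if same_entity(records[i], records[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, r in enumerate(records):
        groups.setdefault(find(i), []).append(r.record_id)
    return sorted(sorted(g) for g in groups.values())
```

Calling `cluster(preprocess(records, query_name))` returns lists of record identifiers, one list per presumed person. Because union-find merges transitively, two records with conflicting birth years could still end up in one cluster through an intermediate record that lacks a birth year; a stricter variant would check that constraint at the cluster level.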