Disambiguating Authors by Pairwise Classification  Cited by: 1

Authors: 林泉 (Lin Quan), 王波 (Wang Bo), 杜圆 (Du Yuan), 王雪至 (Wang Xuezhi), 李玉华 (Li Yuhua), 陈松灿 (Chen Songcan)

Affiliations: [1] Department of Computer Science, Huazhong University of Science and Technology; [2] Department of Computer Science, Nanjing University of Aeronautics and Astronautics; [3] Department of Computer Science, Tsinghua University

Source: Tsinghua Science and Technology, 2010, No. 6, pp. 668-677 (10 pages); 清华大学学报(自然科学版)(英文版) [Journal of Tsinghua University (Natural Science Edition), English Edition]

Funding: Supported by the National Natural Science Foundation of China (Nos. 70771043, 60873225, and 60773191); the National Natural Science Foundation of China (No. 60773061); the Natural Science Foundation of Jiangsu Province (No. BK2008381); and the National High-Tech Research and Development (863) Program of China (No. 2009AA01Z138)

Abstract: Name ambiguity is a critical problem in many applications, particularly in online bibliography systems such as DBLP, ACM, and CiteSeerX. Despite many studies, the problem remains unresolved and is becoming more serious, especially with the increasing popularity of Web 2.0. This paper addresses the problem in the academic researcher social network ArnetMiner using a supervised method that exploits all side information, including co-author, organization, paper citation, title similarity, author's homepage, web constraint, and user feedback. The method automatically determines the number of persons k. Tests on the researcher social network with up to 100 different names show that the method significantly outperforms the baseline method, which uses an unsupervised attribute-augmented graph clustering algorithm.
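
Illustrative sketch (not the authors' implementation): the abstract does not give the paper's features, classifier, or clustering step, so everything below is an assumption made for illustration. The sketch shows the general idea of pairwise classification for author disambiguation: each pair of papers carrying the same author name is scored from side information, pairs judged "same author" are merged, and the number of persons k falls out as the number of resulting groups instead of being fixed in advance. The feature set, the hand-set weights and threshold (standing in for a trained classifier), and the transitive merging via union-find are all hypothetical choices.

```python
from itertools import combinations


def pair_features(a, b):
    """Toy side-information features for two papers (hypothetical feature set)."""
    shared = len(set(a["coauthors"]) & set(b["coauthors"]))
    same_org = 1.0 if a["org"] and a["org"] == b["org"] else 0.0
    ta = set(a["title"].lower().split())
    tb = set(b["title"].lower().split())
    jaccard = len(ta & tb) / len(ta | tb) if (ta | tb) else 0.0
    return [float(shared), same_org, jaccard]


def same_author(features, weights=(1.0, 0.5, 1.0), threshold=0.9):
    """Stand-in for a trained pairwise classifier.

    Assumption: a fixed linear score with hand-picked weights; a real system
    would learn these from labeled pairs.
    """
    score = sum(w * x for w, x in zip(weights, features))
    return score >= threshold


def cluster(papers):
    """Merge papers linked by positive pairwise decisions (union-find)."""
    parent = list(range(len(papers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(papers)), 2):
        if same_author(pair_features(papers[i], papers[j])):
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(papers)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up records that all carry the same ambiguous author name.
papers = [
    {"title": "Graph clustering for name disambiguation",
     "coauthors": ["A. Li"], "org": "Univ X"},
    {"title": "Name disambiguation in digital libraries",
     "coauthors": ["A. Li", "B. Chen"], "org": "Univ X"},
    {"title": "Protein folding simulation",
     "coauthors": ["C. Zhao"], "org": "Institute Y"},
]

clusters = cluster(papers)
k = len(clusters)  # number of persons is a by-product of the merging
print(clusters, k)
```

With these made-up records, the first two papers share a co-author and an organization and are merged, while the third stays separate, so k is 2. Transitive merging is the simplest way to turn pairwise decisions into groups; it can over-merge when the classifier makes a single false-positive link, which is one reason a real system might use a more conservative clustering step.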

Keywords: disambiguating; pairwise classification; ArnetMiner

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
