Disambiguating Chinese Author Names Rendered in English in Foreign-Language Databases: A Practical Study

Practice of Author Name Disambiguation in Chinese-English Translation of Foreign Language Database

Authors: ZHU YuQiang (朱玉强) [1]; JIANG Tao (江涛) [2]; LI YiFei (李翼飞)

Affiliations: [1] Library of Shandong Normal University, Jinan 250014, P.R. China; [2] Library of Hainan Medical University, Haikou 571199, P.R. China

Source: Digital Library Forum (《数字图书馆论坛》), 2022, No. 2, pp. 33-39 (7 pages)

Funding: Supported by the 2021 Hainan Provincial Philosophy and Social Science Planning Project (No. hnsz2021-19).

Abstract: In foreign-language databases, Chinese author names rendered in English are ambiguous: several records may refer to the same person, and a single record may refer to different people. To address this, the study simulates the manual search-and-check procedure, integrating multi-source data, academic social networks, online encyclopedias, and online translation websites as corpora. A program built on automated manipulation of web page document objects, regular expressions, and short-text similarity calculation is used to disambiguate the English renderings of Chinese author names. The results show that the algorithm architecture is stable, effective, and highly extensible, and that its success rate is accepted by practitioners. The approach offers new ideas and methods for data preprocessing and cleaning.

Keywords: name disambiguation; address disambiguation; data governance; foreign-language databases

Classification No.: G255.1 [Cultural Sciences: Library Science]
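
The abstract names regular expressions and short-text similarity calculation as building blocks but gives no implementation detail. The following is a minimal sketch of how those two pieces might fit together, not the authors' program. The function names, the surname-first assumption, the stopword list, the token-level Jaccard measure, and the 0.5 threshold are all illustrative assumptions.

```python
import re

# Assumption: generic tokens that carry little identifying information.
STOPWORDS = {"of", "p", "r", "china", "library", "university", "univ"}


def normalize_name(raw: str) -> str:
    """Reduce a romanized name to a 'surname given' key.

    Assumes the surname comes first, as in 'ZHU YuQiang' or 'Zhu, Yu-Qiang'.
    """
    text = raw.strip()
    if "," in text:
        surname, given = text.split(",", 1)
    else:
        parts = text.split()
        if not parts:
            return ""
        surname, given = parts[0], " ".join(parts[1:])
    # Drop hyphens, spaces, and punctuation so spelling variants collapse.
    surname = re.sub(r"[^A-Za-z]", "", surname).lower()
    given = re.sub(r"[^A-Za-z]", "", given).lower()
    return f"{surname} {given}"


def affiliation_tokens(address: str) -> set:
    """Lowercase, strip punctuation, and remove generic tokens."""
    words = re.sub(r"[^a-z0-9 ]", " ", address.lower()).split()
    return {w for w in words if w not in STOPWORDS}


def affiliation_similarity(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two short address strings."""
    ta, tb = affiliation_tokens(a), affiliation_tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def same_author(rec_a, rec_b, threshold: float = 0.5) -> bool:
    """Treat two (name, address) records as one person when the name keys
    match and the addresses are similar enough. The threshold is arbitrary."""
    name_a, addr_a = rec_a
    name_b, addr_b = rec_b
    if normalize_name(name_a) != normalize_name(name_b):
        return False
    return affiliation_similarity(addr_a, addr_b) >= threshold


# Example records: the first address comes from this record's affiliation
# list; the other two records are made up for illustration.
r1 = ("ZHU YuQiang", "Library of Shandong Normal University, Jinan 250014, P.R.China")
r2 = ("Zhu, Yu-Qiang", "Shandong Normal Univ Library, Jinan, China")
r3 = ("Zhu Yuqiang", "Library of Hainan Medical University, Haikou 571199, P.R.China")

print(same_author(r1, r2))  # same name key, overlapping address tokens
print(same_author(r1, r3))  # same name key, no distinctive tokens shared
```

Token-level Jaccard is used here only because it is easy to inspect by hand. A real pipeline would likely need a similarity measure that tolerates abbreviations and spelling variants, plus the external evidence sources the abstract lists.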

 
