How Do Pronouns Affect Word Embedding  


Authors: Tonglee Chung, Bin Xu, Yongbin Liu, Juanzi Li, Chunping Ouyang

Affiliations: [1] Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; [2] School of Computer Science and Technology, University of South China, Hengyang 421001, China

Source: Tsinghua Science and Technology (Journal of Tsinghua University, Natural Science Edition, English Edition), 2017, Issue 6, pp. 586-594 (9 pages)

Funding: Supported by the National High-Tech Research and Development (863) Program (No. 2015AA015401); the National Natural Science Foundation of China (Nos. 61533018 and 61402220); the State Scholarship Fund of CSC (No. 201608430240); the Philosophy and Social Science Foundation of Hunan Province (No. 16YBA323); and the Scientific Research Fund of Hunan Provincial Education Department (Nos. 16C1378 and 14B153).

Abstract: Word embedding has drawn a lot of attention due to its usefulness in many NLP tasks. So far, a handful of neural-network-based word embedding algorithms have been proposed, none of which considers the effects of pronouns in the training corpus. In this paper, we propose using co-reference resolution to improve word embeddings by extracting better context. We evaluate four word embeddings with co-reference resolution taken into account and compare their quality on word analogy and word similarity tasks across multiple data sets. Experiments show that using co-reference resolution improves word embedding performance on the word analogy task by around 1.88%. We find that words that are names of countries are affected the most, as expected.
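The idea of "extracting better context" can be illustrated with a minimal sketch. It is not the authors' pipeline, which this record does not describe. It assumes a co-reference resolver has already produced a mapping from pronoun positions to antecedent words; the names `resolve_pronouns`, `context_pairs`, and `antecedents`, and the toy sentence, are hypothetical. The sketch substitutes each resolved pronoun with its antecedent and then collects skip-gram-style (target, context) pairs from a symmetric window.

```python
from typing import Dict, List, Tuple


def resolve_pronouns(tokens: List[str], antecedents: Dict[int, str]) -> List[str]:
    """Replace each token whose index appears in `antecedents` with its antecedent."""
    return [antecedents.get(i, tok) for i, tok in enumerate(tokens)]


def context_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Collect (target, context) pairs from a symmetric window around each token."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


# Toy corpus (hypothetical). A real pipeline would tokenize a large corpus.
tokens = ["china", "is", "large", ".", "it", "has", "many", "provinces"]

# Assumed output of a co-reference resolver: token 4 ("it") refers to "china".
antecedents = {4: "china"}

resolved = resolve_pronouns(tokens, antecedents)
raw_pairs = context_pairs(tokens)
resolved_pairs = context_pairs(resolved)
```

In this toy example the pair ("china", "has") appears only in `resolved_pairs`. In the raw text that context is attributed to the pronoun "it", so the country name never sees it during training.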

Keywords: word embedding; co-reference resolution; representation learning

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
