Unrestricted word sense disambiguation method using the Seq2Seq model  Cited by: 5


Authors: TANG Shancheng[1]; MA Fuyu; ZHANG Puyue; CHEN Xiongxiong (School of Communication and Information Engineering, Xi'an University of Science and Technology, Xi'an 710054, China)

Affiliation: [1] School of Communication and Information Engineering, Xi'an University of Science and Technology

Source: Journal of Northwest University (Natural Science Edition), 2019, No. 3, pp. 351-355 (5 pages)

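Funding: Key Research and Development Program of Shaanxi Province (2018GY-151); National Key Research and Development Program of China (2018YFC0808300); Xi'an Science and Technology Plan Project (201805036YD14CG20(4))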

Abstract: Word sense disambiguation plays an important role in Chinese natural language processing. Methods based on traditional machine learning suffer from low accuracy and require manual extraction of text features, while methods based on deep learning are ill-suited to cases where words have many senses. This paper proposes an unrestricted word sense disambiguation method using a Seq2Seq model: the context sequence of a word is fed to an encoder, which encodes it into a latent semantic vector, and a decoder then decodes that vector into a sequence of word senses. The method is applicable to all word sense ambiguity cases. In tests on SemEval-2007 Task #5, the proposed method improves disambiguation accuracy by 11.48% over the best of the seven other methods compared.
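The encoder-decoder flow described in the abstract can be sketched roughly as follows. This is a minimal, untrained toy in pure Python, not the authors' implementation: the class `ToySeq2Seq`, the size `HIDDEN`, the plain tanh recurrent cell, the greedy decoding, and the English example vocabulary are all assumptions made for illustration, and the weights are random, so the predicted senses are arbitrary. It only shows the data path: context sequence, then latent vector, then sense sequence.

```python
import math
import random

random.seed(0)
HIDDEN = 8  # hypothetical size of the latent semantic vector


def rand_matrix(rows, cols):
    """Random weight matrix; a real model would learn these values."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


class ToySeq2Seq:
    """Untrained recurrent encoder-decoder (illustrative only)."""

    def __init__(self, context_vocab, sense_vocab):
        self.context_vocab = context_vocab
        # Assumption: index 0 is the start symbol, index 1 the end symbol.
        self.sense_vocab = sense_vocab
        self.enc_in = rand_matrix(HIDDEN, len(context_vocab))
        self.enc_rec = rand_matrix(HIDDEN, HIDDEN)
        self.dec_in = rand_matrix(HIDDEN, len(sense_vocab))
        self.dec_rec = rand_matrix(HIDDEN, HIDDEN)
        self.out = rand_matrix(len(sense_vocab), HIDDEN)

    @staticmethod
    def step(w_in, w_rec, x, h):
        """One tanh recurrent step: h' = tanh(W_in x + W_rec h)."""
        a, b = matvec(w_in, x), matvec(w_rec, h)
        return [math.tanh(p + q) for p, q in zip(a, b)]

    def encode(self, tokens):
        """Fold the context sequence into a single latent vector."""
        h = [0.0] * HIDDEN
        for tok in tokens:
            x = one_hot(self.context_vocab.index(tok), len(self.context_vocab))
            h = self.step(self.enc_in, self.enc_rec, x, h)
        return h

    def decode(self, latent, max_len=4):
        """Greedily emit sense labels until the end symbol or max_len."""
        h, prev, senses = latent, 0, []
        for _ in range(max_len):
            x = one_hot(prev, len(self.sense_vocab))
            h = self.step(self.dec_in, self.dec_rec, x, h)
            scores = matvec(self.out, h)
            # Skip index 0 so the start symbol is never emitted.
            prev = max(range(1, len(scores)), key=scores.__getitem__)
            if prev == 1:
                break
            senses.append(self.sense_vocab[prev])
        return senses


context_vocab = ["he", "went", "to", "the", "bank", "river"]
sense_vocab = ["<s>", "</s>", "bank/finance", "bank/shore"]
model = ToySeq2Seq(context_vocab, sense_vocab)
latent = model.encode(["he", "went", "to", "the", "bank"])
senses = model.decode(latent)
```

In this sketch the decoder draws from one shared sense vocabulary instead of a separate classifier per ambiguous word, which is one plausible reading of how a sequence-output model could cover every ambiguity case; the abstract itself does not spell out this detail.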

Keywords: natural language processing; word sense disambiguation; Seq2Seq

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
