A Study of Dynamic Word Sense Disambiguation Based on Full-sentence Co-occurrence of Node Word

Original title: 基于节点词全句共现的动态词义消歧研究

Authors: Yan Yaya (闫亚亚) [1]; Xing Hongbing (邢红兵) [2]

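Affiliations: [1] College of Chinese Language and Culture, Jinan University, Guangzhou, Guangdong 510610; [2] Institute on Educational Policy and Evaluation of International Students, Beijing Language and Culture University, Beijing 100083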

Source: Linguistic Sciences (《语言科学》), 2024, No. 4, pp. 354-364 (11 pages)

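Funding: Interim results of a National Natural Science Foundation of China project (32271091) and of a 2022 International Chinese Language Education Research Youth Project of the Center for Language Education and Cooperation, Ministry of Education (22YH69D).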

Abstract: Starting from the view that word sense disambiguation returns a word's sense to its context, this article proposes a dynamic word sense disambiguation method based on full-sentence co-occurrence with a node word. The method first takes the full sentence as the window that delimits the node word's context of use. It then applies statistical measures, namely mutual information (MI), the chi-square (χ²) test, and the ratio of relative word rank (RRWR), to extract words semantically related to the node word, and builds a semantic category database of these related words with reference to Tongyici Cilin (《同义词词林》, A Dictionary of Synonyms). Finally, with co-occurrence frequency as the weighting coefficient, it relies on the distribution of semantic clusters among monosemous words to disambiguate polysemous words of low and medium co-occurrence frequency. In a test on 1,030 polysemous words with fewer than 7 sense categories that co-occur with 美丽 (meili, "beautiful"), the method achieved an accuracy of 85.2%.
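The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the sketch assumes sentence-level pointwise mutual information as the association measure (the χ² test and RRWR are left out because their definitions are not stated), and it assumes a simple frequency-weighted vote over the categories of single-category words. All function names, the toy sentences, and the category labels (`plant`, `person`, `money`, `spend`) are hypothetical and are not Tongyici Cilin codes.

```python
import math
from collections import Counter

def sentence_cooccurrence(sentences, node):
    """Count words sharing a sentence with `node` (full sentence as window)."""
    co_counts = Counter()   # sentences in which a word co-occurs with the node
    word_sent = Counter()   # sentences in which each word appears at all
    for tokens in sentences:
        types = set(tokens)
        word_sent.update(types)
        if node in types:
            co_counts.update(types - {node})
    return co_counts, word_sent

def pmi(word, node, co_counts, word_sent, n_sentences):
    """Sentence-level pointwise mutual information, log base 2 (assumed form)."""
    p_joint = co_counts[word] / n_sentences
    p_word = word_sent[word] / n_sentences
    p_node = word_sent[node] / n_sentences
    return math.log2(p_joint / (p_word * p_node))

def category_distribution(co_counts, categories):
    """Share of each category among single-category co-occurring words,
    weighted by co-occurrence frequency."""
    weights = Counter()
    for word, freq in co_counts.items():
        cats = categories.get(word, [])
        if len(cats) == 1:
            weights[cats[0]] += freq
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}

def disambiguate(word, categories, distribution):
    """Return the candidate category with the largest share, or None
    when no candidate category is supported by any monosemous word."""
    scored = [(distribution.get(c, 0.0), c) for c in categories.get(word, [])]
    best = max(scored, default=(0.0, None))
    return best[1] if best[0] > 0 else None

# Hypothetical toy data: pre-tokenized sentences and a tiny category lexicon.
sentences = [
    ["美丽", "玫瑰", "花"],
    ["美丽", "玫瑰", "姑娘"],
    ["美丽", "花", "草"],
    ["花", "钱"],
]
categories = {
    "玫瑰": ["plant"], "草": ["plant"], "姑娘": ["person"],
    "钱": ["money"], "花": ["plant", "spend"],  # "花" has two categories
}

co_counts, word_sent = sentence_cooccurrence(sentences, "美丽")
distribution = category_distribution(co_counts, categories)
print(pmi("玫瑰", "美丽", co_counts, word_sent, len(sentences)))
print(disambiguate("花", categories, distribution))  # prints "plant"
```

In this toy setting, the single-category words 玫瑰 and 草 give the `plant` category most of the weight around 美丽, so the two-category word 花 is resolved to `plant`. A fuller version would presumably filter candidate words with the association measures before voting, as the abstract's second step suggests.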

Keywords: node word; full-sentence co-occurrence; word sense disambiguation; semantic clustering; unsupervised learning

Classification number: H08 [Language and Writing: Linguistics]

 
