基于分布式表示的汉字部件表义能力测量与应用  (Cited by: 4)

Measurement and Application of Chinese Component Semantic Ability Based on Distributed Representation


Authors: LIANG Shichen (梁诗尘); TANG Xuemei (唐雪梅); HU Renfen (胡韧奋)[1]; WU Jinshan (吴金闪); LIU Zhiying (刘智颖)[1]

Affiliations: [1] Institute of Chinese Information Processing, Beijing Normal University, Beijing 100875, China; [2] School of Systems Science, Beijing Normal University, Beijing 100875, China

Source: Journal of Chinese Information Processing (《中文信息学报》), 2021, No. 5, pp. 17-26 (10 pages)

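Funding: Research Project of the State Language Commission (ZDI135-42); National Social Science Fund of China (18CYY029); Humanities and Social Sciences Fund of the Ministry of Education (18YJAZH112).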

Abstract: The meaning-bearing nature of Chinese characters is a major feature that distinguishes them from phonetic scripts. Components, as the units from which characters are built, are closely tied to character meaning. How much meaning components actually convey, however, remains an open question. To address it, this paper starts from character components and proposes a distributed representation model for characters and words that incorporates components. The model yields some improvement on intrinsic embedding evaluation tasks, and its results on a character motivation measurement task correlate significantly with human ratings. Based on this model, the paper further proposes a method for computing the semantic ability of components, gives an overall assessment of that ability, and, combining it with the character-forming ability of components, establishes a grading system for modern Chinese character components. The measurements show that modern Chinese character components have some semantic ability, but that it is low overall. Finally, the results are applied to teaching Chinese as a foreign language: the paper determines the set of components suited to the component-based teaching method and proposes a corresponding teaching sequence for Chinese characters.

Keywords: Chinese character components; semantic ability measurement; distributed representation

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
