Authors: ZHANG Yin-bing (张引兵), SONG Ji-hua (宋继华)[1], PENG Wei-ming (彭炜明)[1], GUO Dong-dong (郭冬冬)[1], ZHANG Zhao (张曌), SONG Tian-bao (宋天宝)[1]
Affiliations: [1] School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China; [2] School of Mathematical Science, Huaibei Normal University, Huaibei, Anhui 235000, China
Source: Journal of Jilin University: Engineering and Technology Edition (《吉林大学学报(工学版)》), 2020, No. 6, pp. 2212-2220 (9 pages)
Funding: National Natural Science Foundation of China (61877004); Major Project of the National Social Science Fund of China (18ZDA295); Key Natural Science Research Projects of Anhui Higher Education Institutions (KJ2019A0592, KJ2020A0023).
Abstract: Building on an analysis of lexical attributes, and combining a purpose-built word-formation knowledge base with two lexical analogy mechanisms, "subtraction analogy" (减字类推) and "combination analogy" (组合类推), this paper proposes a scheme for evaluating each word's comprehensive coverage contribution to a given corpus. The scheme quantifies how important a word is relative to the corpus, which lays the foundation for vocabulary grading. So that learners meet the "more useful" words first, the word list admits the words with the highest comprehensive coverage contribution first. So that dynamic generation of the graded word list can be solved in limited time, a greedy algorithm handles word selection during generation. Compared with previous studies, the approach has strong interpretability and portability: the generated words and their levels can be adjusted by modifying the relevant parameters, and expert knowledge can be added for manual intervention where needed. Generation of the graded word list is implemented programmatically and automatically. The study offers a new method for generating graded word lists and a reference, in both ideas and methods, for formulating and improving vocabulary syllabuses in the future.
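Keywords: Chinese information processing; international Chinese language teaching; dynamic corpus; graded word list; comprehensive coverage contribution of vocabulary; dynamic generation
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]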
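
The abstract describes greedy selection of the word with the highest coverage contribution, but it does not give the scoring formula. The sketch below is therefore only an illustration under stated assumptions, not the authors' method: contribution is approximated as the corpus frequency of the tokens newly covered by a word, and the analogy mechanisms are replaced by a hypothetical `covers` mapping from a word to the words assumed to be inferable from it. The function name `greedy_graded_list` and the parameter `level_size` are likewise invented for this example.

```python
from collections import Counter


def greedy_graded_list(corpus_tokens, covers, level_size):
    """Order words greedily by marginal coverage gain, then cut into levels.

    corpus_tokens: list of word tokens from the corpus.
    covers: hypothetical dict, word -> set of words assumed to be
            understood once that word is learned (a stand-in for the
            paper's analogy mechanisms, which are not specified here).
    level_size: assumed number of words per level.
    """
    freq = Counter(corpus_tokens)
    covered = set()          # words the learner is assumed to understand
    remaining = set(freq)    # corpus words not yet covered
    ordered = []

    def reach(word):
        # A word always covers itself plus whatever it makes inferable.
        return covers.get(word, set()) | {word}

    while remaining:
        def gain(word):
            # Marginal gain: frequency mass of tokens newly covered.
            return sum(freq[w] for w in reach(word) - covered)

        # Highest gain first; ties broken by string order for determinism.
        best = min(remaining, key=lambda w: (-gain(w), w))
        ordered.append(best)
        covered |= reach(best)
        remaining -= covered

    return [ordered[i:i + level_size]
            for i in range(0, len(ordered), level_size)]


tokens = ["学习", "学习", "学", "习", "词汇", "词汇", "词汇", "词汇", "词"]
inferable = {"学习": {"学", "习"}, "词汇": {"词"}}
levels = greedy_graded_list(tokens, inferable, level_size=1)
```

In this toy run, words that become inferable from an already selected word are treated as covered and are not listed separately. Whether the paper handles them this way is not stated in the abstract.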