Authors: Dongzhi Tsering (才让东知); QI Kun-yu (祁坤钰) [1,2]; Gongbaojiebu (贡保杰布)
Affiliations: [1] Gansu Provincial Key Laboratory of Intelligent Processing of National Languages, Northwest Minzu University, Lanzhou 730030, Gansu, China; [2] Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou 730030, Gansu, China; [3] School of Computer Science, Qinghai Normal University, Xining 810000, Qinghai, China
Source: Journal of Northwest Minzu University (Natural Science), 2023, Issue 3, pp. 16-24 (9 pages)
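Funding: National Natural Science Foundation of China project "Research on Key Technologies of Document-Level Neural Machine Translation for Long Sequences" (62266038).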
Abstract: To address the scarcity of Tibetan lexical resources and the ambiguity of lexical grading, this paper combines a dictionary corpus with a part-of-speech-tagged corpus to design an extraction model for Tibetan monosyllabic monomorphemic words, lays out a detailed technical scheme, and constructs a relatively complete dictionary corpus. A classified list of Tibetan monosyllabic monomorphemic words was obtained, and a graded word list was obtained based on relative generality. The monosyllabic monomorphemic words, such as nouns, verbs, adjectives, adverbs, and numerals, total 1414 entries, and a large number of them belong to more than one part of speech. The work is of great significance to the construction of a Sino-Tibetan language resource base.
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]