Authors: BAN Mabao (班玛宝), CAI Rangjia (才让加) [1,2,3,4,5], ZHANG Rui (张瑞), SE Chajia (色差甲), ZHUO Mazhaxi (卓玛扎西)
Affiliations: [1] College of Computer Science and Technology, Qinghai Normal University, Xining 810016; [2] The State Key Laboratory of Tibetan Intelligent Information Processing and Application, Xining 810008; [3] Tibetan Information Processing Engineering Technology and Research Center of Qinghai Province, Xining 810008; [4] Tibetan Information Processing and Machine Translation Key Laboratory of Qinghai Province, Xining 810008; [5] Key Laboratory of Tibetan Information Processing, Ministry of Education, Xining 810008
Source: Acta Scientiarum Naturalium Universitatis Pekinensis (《北京大学学报(自然科学版)》), 2022, No. 1, pp. 91-98 (8 pages)
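Funding: Supported by the National Natural Science Foundation of China (61662061, 61063033, 61966031); the National Key Research and Development Program of China (2017YFB1402200); a project of the Tibetan Information Processing and Machine Translation Key Laboratory of Qinghai Province (2020-ZJ-Y05); a project of the Qinghai Provincial Department of Science and Technology (2019-SF-129); and Qinghai Provincial Key Laboratory projects (2013-Z-Y17, 2014-Z-Y32, 2015-Z-Y03).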
Abstract: Automatic classification of Tibetan La case (■) example sentences is an important task in Tibetan natural language processing. Based on the usage and affixation rules of the Tibetan La case, this paper classifies La case example sentences, defines the classification categories, and proposes an automatic classification model that fuses dual-channel syllable features. The model first uses word2vec and GloVe to build dual-channel Tibetan syllable embeddings and fuses the two channels within each convolution branch, which enriches the input representation and improves the spatial representation ability of the convolutional layers. Each convolution branch then feeds a Bi-LSTM combined with a hierarchical attention mechanism to learn temporal features, and the outputs of the branches are concatenated to improve the learning of contextual temporal features. Finally, a fully connected layer and a Softmax layer perform the classification. Experiments show that the model reaches an accuracy of 90.26% in classifying Tibetan La case example sentences on the test set.
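The data flow described in the abstract can be illustrated with a heavily simplified sketch. This is not the authors' implementation: it uses a single convolution branch, replaces the Bi-LSTM with hierarchical attention by a plain attention pooling step, and uses random weights in place of trained word2vec and GloVe tables. All sizes, the toy vocabulary, and every function name are hypothetical, and the number of classes is an assumption because the abstract does not state it.

```python
import math
import random

random.seed(0)

EMB_DIM = 4      # hypothetical embedding size per channel
KERNEL = 2       # hypothetical convolution width, in syllables
N_FILTERS = 3    # hypothetical number of convolution filters
N_CLASSES = 5    # hypothetical number of La-case categories


def rand_matrix(rows, cols):
    """Random weights standing in for trained parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def embed(syllable_ids, table_a, table_b):
    """Dual-channel lookup: concatenate both embeddings of each syllable."""
    return [table_a[s] + table_b[s] for s in syllable_ids]


def conv1d(seq, filters, width):
    """Valid 1-D convolution with ReLU over windows of fused syllable vectors."""
    out = []
    for t in range(len(seq) - width + 1):
        window = [v for vec in seq[t:t + width] for v in vec]
        out.append([max(0.0, sum(w * x for w, x in zip(f, window))) for f in filters])
    return out


def attention_pool(features, query):
    """Weight each position by softmax(query . feature) and sum the features."""
    scores = [sum(q * h for q, h in zip(query, feat)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * feat[d] for w, feat in zip(weights, features)) for d in range(dim)]


def classify(syllable_ids, params):
    fused = embed(syllable_ids, params["w2v"], params["glove"])
    feats = conv1d(fused, params["filters"], KERNEL)
    pooled = attention_pool(feats, params["query"])
    logits = [sum(w * p for w, p in zip(row, pooled)) for row in params["dense"]]
    return softmax(logits)


# Toy vocabulary of placeholder syllables (not real Tibetan data).
vocab = {"syl_a": 0, "syl_b": 1, "syl_c": 2, "syl_d": 3}
params = {
    "w2v": rand_matrix(len(vocab), EMB_DIM),      # stands in for word2vec vectors
    "glove": rand_matrix(len(vocab), EMB_DIM),    # stands in for GloVe vectors
    "filters": rand_matrix(N_FILTERS, 2 * EMB_DIM * KERNEL),
    "query": rand_matrix(1, N_FILTERS)[0],
    "dense": rand_matrix(N_CLASSES, N_FILTERS),
}
sentence = [vocab[s] for s in ["syl_a", "syl_c", "syl_b", "syl_d"]]
probs = classify(sentence, params)
print(probs)  # one probability per hypothetical class
```

A faithful version would run several convolution branches with different kernel widths, pass each branch through a Bi-LSTM with attention, and concatenate the branch outputs before the fully connected layer, as the abstract describes.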