检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
机构地区:[1]战略支援部队信息工程大学洛阳校区,河南 洛阳
出 处:《计算机科学与应用》2020年第10期1714-1727,共14页Computer Science and Application
摘 要:依据西班牙语词汇和音素的特征以及词汇标音过程的特点,将西班牙语词汇标音过程建模为序列标注任务,提出基于字符嵌入 + 循环神经网络 + 连接时序分类的端到端词汇标音模型。首先利用word2vec框架在自建的西班牙语词库上训练字符嵌入向量,从而形成西班牙语字符的分布式向量编码表示;之后基于循环神经网络和连接时序分类算法构建了西班牙语词汇标音模型,并在自建的发音词典语料上进行了训练与测试。试验结果显示,基于字符嵌入 + 循环神经网络 + 连接时序分类的词汇标音模型可以获得较其他统计模型或是神经网络模型更高的标音准确率,同时较传统标音模型有更简单的标注流程,对数据集的要求也要低得多,可有效实现端到端的西班牙语词汇标音任务。According to the characteristics of these vocabularies and phonemes and the characteristics of the vocabulary transcription process, the word vocabulary transcription process is modeled as a sequence labeling task, and an end-to-end vocabulary transcription model method based on character embedding + recurrent neural network + connection arrangement classification is proposed. First, this paper uses the word2vec framework to train the character embedding vector on the self-built serial thesaurus to form a distributed encoding representation of the character;then based on the recurrent neural network and the connection classification algorithm, a model called vocabulary transcription is constructed. The test results show that the word transcription model of string embedding + cyclic neural network + connection order classification can use higher transcription accuracy than other statistical models or neural network models. At the same time, it has a simpler labeling process than traditional phonetic models. The requirements of the phonetic transcription should also be reduced, that can effectively realize the end-to-end task called phonetic transcription.
分 类 号:TP3[自动化与计算机技术—计算机科学与技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.62