Research on Predictive Model of Spanish Vocabulary Pronunciation Based on Recurrent Neural Network

Authors: Zhao Jiaogu (赵皎谷), Ma Yanzhou (马延周), Huang Xiaohui (黄晓辉)

Affiliation: [1] Luoyang Campus, Strategic Support Force Information Engineering University, Luoyang, Henan

Source: Computer Science and Application (《计算机科学与应用》), 2020, No. 10, pp. 1714-1727 (14 pages)

Abstract: Based on the characteristics of Spanish words and phonemes and of the word transcription process, Spanish word transcription is modeled as a sequence labeling task, and an end-to-end transcription model based on character embedding + recurrent neural network + connectionist temporal classification (CTC) is proposed. First, the word2vec framework is used to train character embedding vectors on a self-built Spanish lexicon, yielding a distributed vector representation of Spanish characters. A Spanish word transcription model is then built on a recurrent neural network and the CTC algorithm, and is trained and tested on a self-built pronunciation dictionary corpus. The experimental results show that the character embedding + recurrent neural network + CTC model achieves higher transcription accuracy than other statistical or neural network models. It also has a simpler labeling pipeline than traditional transcription models, places far lower demands on the dataset, and can effectively perform end-to-end Spanish word transcription.
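
As a rough illustration of the connectionist temporal classification step named in the abstract, the sketch below shows only the standard CTC collapsing rule that maps a per-step label path to an output phoneme sequence: merge consecutive repeats, then drop blanks. It is not the authors' implementation. The blank symbol `_`, the function name `ctc_greedy_collapse`, and the phoneme labels are assumptions made for illustration; none of them come from the paper.

```python
from itertools import groupby

# Hypothetical blank symbol; the paper's actual label inventory is not given here.
BLANK = "_"


def ctc_greedy_collapse(path):
    """Collapse a per-step label path using the standard CTC rule.

    Step 1: merge runs of identical consecutive labels into one label.
    Step 2: remove blank labels.
    """
    merged = [label for label, _ in groupby(path)]
    return [label for label in merged if label != BLANK]


# Illustrative path for the Spanish word "casa"; the phoneme symbols are assumed.
path = ["k", "k", "_", "a", "_", "s", "s", "a", "a"]
print(ctc_greedy_collapse(path))
```

A blank between two identical labels keeps them distinct, so a path such as `["a", "_", "a"]` collapses to two separate `a` labels instead of one. This is what lets an output sequence contain repeated phonemes without any explicit character-to-phoneme alignment in the training data.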

Keywords: Spanish; pronunciation dictionary; grapheme-to-phoneme conversion; recurrent neural network

Classification code: TP3 [Automation and Computer Technology: Computer Science and Technology]

 
