Authors: 冯伟 (FENG Wei); 易绵竹 (YI Mianzhu); 马延周 (MA Yanzhou) (The PLA Strategic Support Force Information Engineering University, Luoyang Campus, Luoyang, Henan 471003, China)
Affiliation: [1] Luoyang Campus, PLA Strategic Support Force Information Engineering University, Luoyang, Henan 471003, China
Source: Journal of Chinese Information Processing (《中文信息学报》), 2018, No. 2, pp. 87-93, 101 (8 pages)
Funding: Luoyang Social Science Planning Project (2016B285)
Abstract: Grapheme-to-phoneme (G2P) conversion plays a crucial role in resource construction for Russian speech information processing. This paper designs an improved Russian phoneme set based on SAMPA so that transcriptions reflect the stress position and vowel reduction of Russian words. A Russian pronunciation dictionary of 20,000 words is built with the new phoneme set. On this basis, the paper implements a data-driven Russian G2P algorithm that applies the Weighted Finite-State Transducer (WFST) to alignment, modeling, and decoding. First, Russian grapheme and phoneme sequences are aligned in a many-to-many fashion using the Expectation-Maximization algorithm. Then a joint N-gram model is trained on the alignment result and converted into a WFST pronunciation model. Finally, the pronunciation of an arbitrary input word is predicted with a WFST decoding algorithm. In cross-validation experiments, the average word accuracy is 62.9% and the average phoneme accuracy is 92.2%.
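The abstract reports two evaluation figures, word accuracy and phoneme accuracy, without saying how they are computed. The sketch below shows one common way to compute such metrics, assuming word accuracy is the share of words whose predicted phoneme sequence matches the reference exactly and phoneme accuracy is one minus the edit-distance error rate over reference phonemes. The function names `edit_distance` and `g2p_accuracy` and the toy phoneme symbols are hypothetical, and the paper's own scoring procedure may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences (lists of symbols)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def g2p_accuracy(references, hypotheses):
    """Return (word_accuracy, phoneme_accuracy) for paired phoneme sequences.

    Assumed definitions (not taken from the paper):
      word accuracy    = exact sequence matches / number of words
      phoneme accuracy = 1 - total edit distance / total reference phonemes
    """
    if not references or len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must be non-empty and paired")
    total_phonemes = sum(len(r) for r in references)
    if total_phonemes == 0:
        raise ValueError("references must contain at least one phoneme")
    pairs = list(zip(references, hypotheses))
    correct_words = sum(r == h for r, h in pairs)
    total_errors = sum(edit_distance(r, h) for r, h in pairs)
    return correct_words / len(references), 1.0 - total_errors / total_phonemes


# Toy data with made-up symbols, not the paper's SAMPA-based phoneme set.
refs = [["m", "a", "m", "a"], ["d", "o", "m"]]
hyps = [["m", "a", "m", "a"], ["d", "a", "m"]]
word_acc, phoneme_acc = g2p_accuracy(refs, hyps)
```

Under these definitions a single wrong phoneme makes the whole word count as incorrect, which is why word accuracy is typically much lower than phoneme accuracy.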
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]