Authors: CHENG Gao-feng (程高峰), YAN Yong-hong (颜永红) [1,2]
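Affiliations: [1] Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China; [2] School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China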
Source: Computer Science (《计算机科学》), 2022, No. 1, pp. 47-52 (6 pages)
Abstract: With the rapid development of multimedia and communication technology, the amount of multilingual speech data on the Internet keeps growing. Speech recognition is the core technology of speech analysis and processing, and quickly extending the capabilities built for a few high-resource major languages, such as Chinese and English, to many more low-resource languages is a bottleneck that recognition technology urgently needs to overcome. This article summarizes recent progress in acoustic modeling and discusses the difficulties that traditional speech recognition may face in moving from a single language to multiple languages. On this basis, it explores the role of the latest end-to-end speech recognition technology in building keyword spotting systems, with the aim of further improving overall system performance. The recent advances summarized are: 1) multilingual acoustic modeling based on model parameter sharing; 2) multilingual acoustic modeling based on language classification information; 3) end-to-end keyword spotting based on frame-level alignment.
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]