Improving Speech Enhancement Framework via Deep Learning  

Authors: Sung-Jung Hsiao, Wen-Tsai Sung

Affiliations: [1] Department of Information Technology, Takming University of Science and Technology, Taipei City, 11451, Taiwan; [2] Department of Electrical Engineering, National Chin-Yi University of Technology, Taichung, 411030, Taiwan

Source: Computers, Materials & Continua, 2023, No. 5, pp. 3817-3832 (16 pages; English-language journal)

Funding: This research was supported by the Department of Electrical Engineering at National Chin-Yi University of Technology. The authors would like to thank the National Chin-Yi University of Technology, Takming University of Science and Technology, Taiwan, for supporting this research.

Abstract: Speech plays an extremely important role in social activities. Many individuals suffer from a “speech barrier,” which limits their communication with others. In this study, an improved speech recognition method is proposed that addresses the needs of speech-impaired and deaf individuals. An improved baseline acoustic model with a connectionist temporal classification convolutional neural network (CTC-CNN) architecture was constructed by combining a speech database with a deep neural network. Acoustic sensors were used to convert the collected voice signals into text or corresponding voice signals to improve communication. The method can be extended to modern artificial intelligence techniques, with multiple applications such as meeting minutes, medical reports, and verbatim records for cars, sales, etc. In the experiments, a modified CTC-CNN was used to train an acoustic model, which showed better performance than earlier common algorithms. Thus, a CTC-CNN baseline acoustic model was constructed and optimized, which reduced the error rate to about 18% and improved the accuracy rate.
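
For readers unfamiliar with the connectionist temporal classification (CTC) part of the CTC-CNN model named in the abstract, the following is a minimal Python sketch of greedy (best-path) CTC decoding. It is not the authors' implementation: the function name `ctc_greedy_decode`, the toy label set, and the choice of index 0 for the blank symbol are assumptions made purely for illustration. It shows only the standard CTC collapse rule: pick the most likely label per frame, merge consecutive repeats, then drop blanks.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (illustrative convention)


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      labels: Sequence[str],
                      blank: int = BLANK) -> str:
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Step 1: pick the highest-scoring label index for every frame.
    best_path: List[int] = [
        max(range(len(frame)), key=lambda i: frame[i]) for frame in frame_scores
    ]

    # Steps 2 and 3: emit a label only when it differs from the previous
    # frame's label (merging repeats) and is not the blank symbol.
    decoded: List[str] = []
    previous = None
    for index in best_path:
        if index != previous and index != blank:
            decoded.append(labels[index])
        previous = index
    return "".join(decoded)


# Toy example with hypothetical per-frame scores over the labels "-", "a", "b".
labels = ["-", "a", "b"]
scores = [
    [0.10, 0.80, 0.10],  # "a"
    [0.10, 0.70, 0.20],  # "a" again, merged with the previous frame
    [0.90, 0.05, 0.05],  # blank, separates the two "a" symbols
    [0.20, 0.70, 0.10],  # "a", a new symbol because a blank came before it
    [0.10, 0.20, 0.70],  # "b"
]
decoded = ctc_greedy_decode(scores, labels)
print(decoded)
```

The blank frame is what lets the decoder output a doubled symbol such as "aa": without it, the two runs of "a" would be merged into one.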

Keywords: Artificial intelligence; speech recognition; speech to text; CTC-CNN

Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
