Deep Learning Driven Arabic Text to Speech Synthesizer for Visually Challenged People  


Authors: Mrim M. Alnfiai, Nabil Almalki, Fahd N. Al-Wesabi, Mesfer Alduhayyem, Anwer Mustafa Hilal, Manar Ahmed Hamza

Affiliations: [1] King Salman Center for Disability Research, Riyadh, 13369, Saudi Arabia; [2] Department of Information Technology, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif, 21944, Saudi Arabia; [3] Department of Special Education, College of Education, King Saud University, Riyadh, 12372, Saudi Arabia; [4] Department of Computer Science, College of Science & Arts at Muhayel, King Khaled University, Abha, 62217, Saudi Arabia; [5] Department of Computer Science, College of Sciences and Humanities-Aflaj, Prince Sattam bin Abdulaziz University, Al-Aflaj, 16733, Saudi Arabia; [6] Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam bin Abdulaziz University, AlKharj, 16242, Saudi Arabia

Source: Intelligent Automation & Soft Computing, 2023, No. 6, pp. 2639-2652 (14 pages)

Funding: The authors extend their appreciation to the King Salman Center for Disability Research for funding this work through Research Group no. KSRG-2022-030.

Abstract: Text-To-Speech (TTS) is a speech-processing tool that is highly helpful for visually challenged people: it transforms written text into human-like sounds. However, achieving accurate TTS outcomes for non-diacritized Arabic text is challenging, since the language has several unique features and rules. Special signs such as gemination marks and diacritics, which respectively indicate consonant doubling and short vowels, greatly affect the precise pronunciation of Arabic. Yet such signs are rarely written, since Arabic speakers and readers can infer them from context. Against this background, the current research article introduces an Optimal Deep Learning-driven Arabic Text-to-Speech Synthesizer (ODLD-ATSS) model to help visually challenged people in the Kingdom of Saudi Arabia. The prime aim of the presented ODLD-ATSS model is to convert text into speech signals for visually challenged people. To attain this, the ODLD-ATSS model first designs a Gated Recurrent Unit (GRU)-based prediction model for diacritic and gemination signs. In addition, the Buckwalter code is utilized to capture, store, and display Arabic texts. To improve the TTS performance of the GRU method, the Aquila Optimization Algorithm (AOA) is used, which shows the novelty of the work. To illustrate the enhanced performance of the proposed ODLD-ATSS model, further experimental analyses were conducted. The proposed model achieved a maximum accuracy of 96.35%, and the experimental outcomes indicate the improved performance of the proposed ODLD-ATSS model over other DL-based TTS models.
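The abstract's mention of the Buckwalter code refers to a standard one-to-one transliteration between Arabic script and ASCII characters, which makes diacritics and the gemination mark (shadda) easy to store and to emit as prediction targets. The sketch below is an illustrative subset of the mapping, not the authors' implementation; a full Buckwalter table has roughly 50 entries.

```python
# Illustrative subset of the Buckwalter transliteration table: each Arabic
# letter or diacritic maps to exactly one ASCII character, so diacritized and
# non-diacritized text can be stored and compared character-by-character.
BUCKWALTER = {
    "\u0627": "A",  # alif
    "\u0628": "b",  # ba
    "\u062A": "t",  # ta
    "\u062C": "j",  # jim
    "\u062F": "d",  # dal
    "\u0631": "r",  # ra
    "\u0633": "s",  # sin
    "\u0643": "k",  # kaf
    "\u0644": "l",  # lam
    "\u0645": "m",  # mim
    "\u0646": "n",  # nun
    "\u0647": "h",  # ha
    "\u0648": "w",  # waw
    "\u064A": "y",  # ya
    "\u064E": "a",  # fatha (short vowel)
    "\u064F": "u",  # damma (short vowel)
    "\u0650": "i",  # kasra (short vowel)
    "\u0652": "o",  # sukun (no vowel)
    "\u0651": "~",  # shadda (gemination sign)
}

def to_buckwalter(text: str) -> str:
    """Transliterate Arabic text to Buckwalter ASCII, passing through unknown characters."""
    return "".join(BUCKWALTER.get(ch, ch) for ch in text)

# The diacritized word كَتَبَ ("he wrote") becomes "kataba";
# its undiacritized form كتب becomes "ktb".
print(to_buckwalter("\u0643\u064E\u062A\u064E\u0628\u064E"))  # kataba
print(to_buckwalter("\u0643\u062A\u0628"))                    # ktb
```

In a diacritic-restoration setup such as the one the paper describes, the undiacritized string ("ktb") would be the GRU's input sequence and the diacritic marks ("a", "a", "a") the per-character prediction targets.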

Keywords: Saudi Arabia, visually challenged people, deep learning, Aquila optimizer, gated recurrent unit

Classification: TP311 [Automation & Computer Technology — Computer Software and Theory]; TP18 [Automation & Computer Technology — Computer Science and Technology]

 
