MDD: A Unified Multimodal Deep Learning Approach for Depression Diagnosis Based on Text and Audio Speech


Authors: Farah Mohammad, Khulood Mohammed Al Mansoor

Affiliations: [1] Center of Excellence and Information Assurance (CoEIA), King Saud University, Riyadh, 11543, Saudi Arabia [2] Department of Computer Science and Technology, Arab East Colleges, Riyadh, 11583, Saudi Arabia [3] Self-Development Skills Department, King Saud University, Riyadh, 11543, Saudi Arabia

Source: Computers, Materials & Continua (English-language journal), 2024, Issue 12, pp. 4125-4147 (23 pages)

Abstract: Depression is a prevalent mental health issue affecting individuals of all age groups globally. As with other mental health disorders, diagnosing depression presents significant challenges for medical practitioners and clinical experts, primarily due to societal stigma and a lack of awareness and acceptance. Although medical interventions such as therapies, medications, and brain stimulation therapy provide hope for treatment, there is still a gap in the efficient detection of depression. Traditional methods, like in-person therapies, are both time-consuming and labor-intensive, emphasizing the necessity for technological assistance, especially through Artificial Intelligence. Alternatively, in most cases depression is diagnosed through questionnaire-based mental status assessments. However, this method often produces inconsistent and inaccurate results. Additionally, there is currently no comprehensive diagnostic framework that achieves accurate and robust diagnostic outcomes. For a considerable time, researchers have sought methods to identify symptoms of depression through individuals' speech and responses, leveraging automation systems and computer technology. This research proposes MDD, which is composed of multimodal data collection, preprocessing, and feature extraction (utilizing the T5 model for text features and the WaveNet model for speech features). Canonical Correlation Analysis (CCA) is then used to create correlated projections of text and audio features, followed by feature fusion through concatenation. Finally, depression detection is performed using a neural network with a sigmoid output layer. On the Distress Analysis Interview Corpus-Wizard of Oz (DAIC-WOZ) dataset, the proposed model attained an accuracy of 92.75%, precision of 92.05%, and recall of 92.22%. On the E-DAIC dataset, it achieved an accuracy of 91.74%, precision of 90.35%, and recall of 90.95%. On the CD-III dataset (Custom Dataset for Depression), the model demonstrated an accuracy of 93.05%, precision of 92

Keywords: depression; deep learning; T5; WaveNet; CCA; neural network

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]
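
The fusion stage described in the abstract (CCA projections of text and audio features, concatenation, then a sigmoid output) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the random arrays stand in for T5 and WaveNet embeddings, the names `fit_cca` and `fuse`, the regularization term `reg`, the dimensions, and the untrained single-layer head are all assumptions made for illustration.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-3):
    """Fit a regularized CCA and return centering means, projections, correlations.

    Hypothetical helper: solves CCA by whitening each view and taking the
    SVD of the whitened cross-covariance matrix.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    Wx = Kx @ U[:, :k]       # projection for the text view
    Wy = Ky @ Vt.T[:, :k]    # projection for the audio view
    return mx, my, Wx, Wy, s[:k]

def fuse(X, Y, params):
    """Project both views with CCA and concatenate the projections."""
    mx, my, Wx, Wy, _ = params
    return np.concatenate([(X - mx) @ Wx, (Y - my) @ Wy], axis=1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic stand-ins for text and audio embeddings (assumed sizes).
rng = np.random.default_rng(0)
n, d_text, d_audio, k = 200, 16, 12, 4
shared = rng.normal(size=(n, k))  # latent signal common to both modalities
text_feats = shared @ rng.normal(size=(k, d_text)) + 0.5 * rng.normal(size=(n, d_text))
audio_feats = shared @ rng.normal(size=(k, d_audio)) + 0.5 * rng.normal(size=(n, d_audio))

params = fit_cca(text_feats, audio_feats, k)
fused = fuse(text_feats, audio_feats, params)  # shape (n, 2 * k)

# Untrained single-layer head with a sigmoid output; a real classifier
# would learn w and b from labeled interviews.
w = rng.normal(scale=0.1, size=fused.shape[1])
b = 0.0
probs = sigmoid(fused @ w + b)
```

In this sketch each row of `fused` holds the `k` text projections followed by the `k` audio projections, and `probs` holds one value between 0 and 1 per sample, which a trained model would interpret as the probability of the positive (depressed) class.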

 
