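Detection of mood phases in patients with bipolar disorder based on deep learning speech analysis

(Published English title: Emotional time-based detection of patients with bipolar disorder based on deep learning speech analysis)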


Authors: Li Zhiying[1]; Ji Jun; Zhou Shuzhe; Li Jiaqi; Li Xinhui; Feng Chaonan; Guan Lili[1]; Ma Zaohui; Ma Yantao[1]

Affiliations: [1] Clinical Research Division, Peking University Sixth Hospital; Peking University Institute of Mental Health; NHC Key Laboratory of Mental Health (Peking University); National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), Beijing 100191, China [2] College of Computer Science and Technology, Qingdao University, Qingdao 266071, China [3] Department of Psychology, Queen's University, Kingston, Ontario K7L 3N6, Canada [4] College of Electronic and Electrical Engineering, University of London, London N16AT, UK [5] Beijing Wanling Pangu Science and Technology Ltd., Beijing 100080, China [6] Zhongshan School of Medicine, Guangzhou 510080, China

Source: Chinese Journal of Psychiatry (《中华精神科杂志》), 2024, Issue 4, pp. 207-212 (6 pages)

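Funding: Capital Development Special Fund, Beijing Municipal Science and Technology Commission (2018-2-4112); Major Project for Capital Clinical Characteristic Application Research and Achievement Promotion, Beijing Municipal Science and Technology Commission (Z171100001017086).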

Abstract:

Objective: To distinguish depressive and manic mood phases in patients with bipolar disorder (BD) using a speech-based deep learning approach.

Methods: Sixty-one patients with BD who attended the psychiatric outpatient department of Peking University Sixth Hospital between June 2018 and March 2022 were enrolled. Mood phase was assessed with the Quick Inventory of Depressive Symptomatology, the Mood Disorder Questionnaire, and the Young Mania Rating Scale. Speech was recorded from all patients, yielding 190 samples each for remission, depressive mood, and manic mood. A speech analysis library in Python was used to extract 136 features, including Mel-frequency cepstral coefficients and zero-crossing rate, and a LIGHT-SERNET-like network was trained to detect mood phase. Accuracy was used to assess overall model performance; sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the receiver operating characteristic (ROC) curve were used to assess the predictions for each of the three mood phases. Demographic characteristics were compared across mood phases with the Kruskal-Wallis H test or the χ² test.

Results: Age (H=25.83, P<0.001), years of education (H=25.25, P<0.001), and marital status (χ²=23.81, P<0.001) differed significantly among patients in the three mood phases; gender did not (χ²=4.63, P=0.099). The LIGHT-SERNET-like model detected the three mood phases with an accuracy of 0.84. For remission, sensitivity was 0.88, specificity 0.93, PPV 0.87, and NPV 0.94. For depressive mood, sensitivity was 0.82, specificity 0.92, PPV 0.84, and NPV 0.92. For manic mood, sensitivity was 0.82, specificity 0.91, PPV 0.83, and NPV 0.91. The areas under the ROC curves for the three mood phases were similar, all above 0.90.

Conclusion: A model built by deep learning analysis of speech with a LIGHT-SERNET-like network discriminates well between the depressive and manic mood phases of bipolar disorder.
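The per-phase metrics named in the abstract (sensitivity, specificity, PPV, NPV) are one-vs-rest quantities derived from a three-class confusion matrix, and accuracy is the share of samples on its diagonal. The following is a minimal sketch of that calculation, not the authors' code. The function names and the confusion matrix are hypothetical and chosen only for illustration; the counts are not data from the study.

```python
from typing import Dict, List


def one_vs_rest_metrics(cm: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Per-class metrics from a confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Assumes every class has at least one true and
    one predicted sample, so no denominator is zero.
    """
    total = sum(sum(row) for row in cm)
    results = {}
    for k, name in enumerate(labels):
        tp = cm[k][k]                         # class k correctly detected
        fn = sum(cm[k]) - tp                  # class k predicted as another class
        fp = sum(row[k] for row in cm) - tp   # other classes predicted as k
        tn = total - tp - fn - fp             # everything else
        results[name] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "ppv": tp / (tp + fp),
            "npv": tn / (tn + fn),
        }
    return results


def accuracy(cm: List[List[int]]) -> float:
    """Overall accuracy: correctly classified samples over all samples."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


# Hypothetical confusion matrix for illustration only (NOT study data).
labels = ["remission", "depressive", "manic"]
cm = [
    [17, 2, 1],   # true remission
    [3, 15, 2],   # true depressive
    [1, 3, 16],   # true manic
]

metrics = one_vs_rest_metrics(cm, labels)
overall = accuracy(cm)

for name in labels:
    print(name, {key: round(value, 2) for key, value in metrics[name].items()})
print("accuracy", round(overall, 2))
```

Because each phase is scored against the other two combined, specificity and NPV depend on how the remaining phases are classified as well as on the phase itself.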

Keywords: bipolar disorder; speech; mood phase; deep learning; LIGHT-SERNET-like network

Classification number: R749.4 [Medicine and Health: Neurology and Psychiatry]

 
