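Affiliations: [1] College of Traditional Chinese Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou 350122, Fujian; [2] Fujian Provincial Key Laboratory of Traditional Chinese Medicine Health Status Identification, Fuzhou 350122, Fujian
Source: Journal of Southern Medical University (《南方医科大学学报》), 2025, No. 4, pp. 711-717 (7 pages)
Funding: Natural Science Foundation of Fujian Province (2022J01361); Fujian University of Traditional Chinese Medicine Basic Discipline Enhancement Project (XJC2023004).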
Abstract: Objective To analyze the vocal emotional features of a subthreshold depression group and a normal group, and to construct speech recognition classification models with 6 machine learning algorithms, providing an objective basis for identifying subthreshold depression and improving its early diagnosis rate. Methods Voice data were collected from normal individuals and participants with subthreshold depression while they read selected words and texts aloud. From each voice segment, 384-dimensional vocal emotional feature variables were extracted, covering energy features, Mel-frequency cepstral coefficients, zero-crossing rate features, voicing probability features, fundamental frequency features, and delta (difference) features. Recursive Feature Elimination (RFE) was used to select voice feature variables. Classification models were then built with Adaptive Boosting (AdaBoost), Random Forest (RF), Linear Discriminant Analysis (LDA), Logistic Regression (LR), Lasso Regression (LRLasso), and Support Vector Machine (SVM), and the performance of each model was evaluated. To assess generalization capability, the best subthreshold depression speech recognition models were tested on real-world speech data. Results On the word-reading speech test set, the AdaBoost, RF, and LDA models achieved prediction accuracies of 100%, 100%, and 93.3%, respectively. On the text-reading speech test set, the accuracies of the AdaBoost, RF, and LDA models were 90%, 80%, and 90%, respectively, while the accuracies of the other 3 models were all below 80%. On real-world word-reading and text-reading speech data, the AdaBoost model still reached accuracies of 91.7% and 80.6%, and the RF model 86.1% and 77.8%, respectively. Conclusion Analyzing vocal emotional features allows effective identification of individuals with subthreshold depression. The AdaBoost and RF models perform well in classifying individuals with subthreshold depression and may offer a useful reference for clinical application and research.
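The workflow described in the Methods (feature selection by RFE, followed by several classifiers compared on a held-out test set) can be sketched roughly as below. This is only an illustrative sketch using scikit-learn on synthetic data: the sample size, the choice of a random forest as the RFE ranking estimator, the number of retained features (20), the train/test split, and all hyperparameters are assumptions, not details taken from the record. The Lasso variant is left out because the record does not specify its form.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for 384-dimensional feature vectors (NOT the study's data).
X, y = make_classification(
    n_samples=120, n_features=384, n_informative=20, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize using statistics from the training split only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Recursive feature elimination: repeatedly fit the ranking estimator and
# drop the least important features (a fixed chunk per round, sized at 20%
# of the original feature count) until 20 remain.
# The ranking estimator and the target count are assumptions.
selector = RFE(
    RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=20,
    step=0.2,
)
selector.fit(X_train_s, y_train)
X_train_sel = selector.transform(X_train_s)
X_test_sel = selector.transform(X_test_s)

# Five of the six model families named in the abstract, default-ish settings.
models = {
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "LDA": LinearDiscriminantAnalysis(),
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train_sel, y_train)
    # score() returns the fraction of correctly classified test samples.
    accuracies[name] = model.score(X_test_sel, y_test)
    print(f"{name}: test accuracy = {accuracies[name]:.3f}")
```

Fitting the scaler and the selector on the training split only keeps information from the test samples out of feature selection; the accuracies printed here come from synthetic data and say nothing about the reported results.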
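Keywords: subthreshold depression identification; vocal emotional features; machine learning; adaptive boosting (AdaBoost); random forest
Classification No.: R749.4 [Medicine and Health: Neurology and Psychiatry]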