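Authors: 程振 (CHENG Zhen); 贾嘉敏 (JIA Jia-min); 蒋作 (JIANG Zuo) [2]; 王欣 (WANG Xin) (School of Electrical and Information Technology, Yunnan Minzu University, Kunming 650500, China; School of Mathematics and Computer Science, Yunnan Minzu University, Kunming 650500, China)
Affiliations: [1] School of Electrical and Information Technology, Yunnan Minzu University, Kunming, Yunnan 650500; [2] School of Mathematics and Computer Science, Yunnan Minzu University, Kunming, Yunnan 650500
Source: Journal of Yunnan Minzu University: Natural Sciences Edition (《云南民族大学学报(自然科学版)》), 2025, No. 1, pp. 84-92 (9 pages)
Funding: National Natural Science Foundation of China (61866040).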
Abstract: The language communication efficiency score is a method for quantifying stuttering severity, and it requires the times at which stuttering occurs. Existing work, however, can only decide whether a speech segment contains stuttering; it cannot localize where the stuttering occurs, which hinders severity assessment. To address the inability of current deep-learning stuttering-type detectors to visually localize their targets, this paper first converts speech into spectrograms with the short-time Fourier transform, then labels the stuttering types on the spectrograms, and finally detects the stuttering types with YOLOv5. Within the YOLOv5 framework, four models of different depth and width (YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x) are evaluated for classifying and localizing stuttering types, and the best-performing model, YOLOv5l, is improved by introducing an efficient channel attention mechanism and the CIOU bounding-box loss function. Experimental results show that the improved YOLOv5l model has a clearly lower training loss, that its accuracy, recall, and mAP_0.5 increase by 1.2, 0.6, and 0.4 percentage points, respectively, and that it misses fewer detections than the original model.
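The CIOU bounding-box loss named in the abstract can be illustrated with a short sketch. The following is a minimal pure-Python version of the commonly published Complete-IoU formula, not the authors' implementation; the function name `ciou_loss`, the corner-format `(x1, y1, x2, y2)` boxes, and the `eps` stabilizer are assumptions made here for illustration.

```python
import math


def ciou_loss(box_a, box_b, eps=1e-9):
    """Return 1 - CIoU for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper: box format and eps are illustrative assumptions.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Plain IoU: overlap area divided by union area.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = aw * ah + bw * bh - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centres.
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0

    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4.0 / math.pi ** 2) * (
        math.atan(bw / (bh + eps)) - math.atan(aw / (ah + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - (iou - rho2 / c2 - alpha * v)
```

With identical boxes the loss is approximately zero. With disjoint boxes the IoU term vanishes but the centre-distance penalty remains, so the loss exceeds 1 and still indicates how far apart the boxes are.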
Classification number: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]