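Author: HU Guichao (胡贵超) (School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210018)
Affiliation: [1] School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210018
Source: Computer & Digital Engineering (《计算机与数字工程》), 2023, No. 12, pp. 2827-2830 (4 pages)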
Abstract: An improved time delay neural network (TDNN) speaker recognition method is proposed to improve speaker recognition accuracy. First, audio features are trained through the TDNN to obtain a partial feature representation of the speaker. The features are then processed simultaneously by added quantization and counting operators (QCO), which make full use of the low-level texture features of the audio to capture the detailed information in the features. Experimental results show that the improved TDNN can learn a more informative feature representation from a relatively small amount of data, giving it a clear advantage when the training set is small. The effect becomes more pronounced as the amount of data increases: the detailed features extracted by the structure trained with the texture statistics method improve speaker recognition performance.
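Keywords: speaker recognition; time delay neural network; quantization and counting operators; qco-vector
Classification code: O235 [Science: Operations Research and Cybernetics]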