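Authors: Li Wei (李威) [1], Yang Jichen (杨继臣) [1], He Qianhua (贺前华) [1], Li Yanxiong (李艳雄) [1]
Affiliation: [1] School of Electronic and Information Engineering, South China University of Technology, Guangzhou, Guangdong 510640
Source: Journal of Huazhong University of Science and Technology (Natural Science Edition), 2015, No. 7, pp. 62-65 (4 pages)
Funding: National Natural Science Foundation of China (61301300); China Postdoctoral Science Foundation (2013M531850); Fundamental Research Funds for the Central Universities (2013ZM0097)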
Abstract: Shallow features do not characterize speakers effectively, which leads to low speaker retrieval rates. To address this, a speaker retrieval method based on deep speaker vectors is proposed. First, a multi-layer deep feature extractor is built layer by layer from restricted Boltzmann machines (RBMs) and used to extract deep speaker features. Second, a deep speaker vector is constructed for each speaker from these deep features. Finally, the target speaker is retrieved by computing the minimal distance between the deep speaker vector of the query speaker and the deep features of the speakers in the retrieval library. Experimental results show that, with deep features, deep speaker vectors retrieve most of the target speakers. The retrieval rates of the first and second layers are lower than that of mel-frequency cepstral coefficients (MFCC), and the rate of the third layer equals that of MFCC. As the number of layers increases, the retrieval rate first rises and then falls, peaking at a depth of 7 layers, and the retrieval time increases nonlinearly.
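Keywords: deep feature; deep speaker vector; minimal distance; speaker retrieval; retrieval rate
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]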
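Illustration: the minimal-distance retrieval step described in the abstract might be sketched as follows. This is not the authors' implementation. The abstract does not state how a deep speaker vector is built or which distance is used, so the mean pooling in `speaker_vector`, the Euclidean distance, the function names, and the toy two-dimensional "deep features" are all assumptions made for illustration.

```python
import math

def speaker_vector(frames):
    """Pool equal-length deep-feature frames into one vector by averaging.
    Mean pooling is an assumption; the abstract does not specify the construction."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]

def min_distance(query_vec, frames):
    """Smallest Euclidean distance from the query vector to any frame of one speaker."""
    return min(math.dist(query_vec, f) for f in frames)

def retrieve(query_vec, library):
    """Return library speaker IDs ordered by increasing minimal distance to the query."""
    scores = {spk: min_distance(query_vec, frames) for spk, frames in library.items()}
    return sorted(scores, key=scores.get)

# Hypothetical toy library: speaker ID -> list of deep-feature frames.
library = {
    "spk_a": [[0.0, 0.0], [0.2, 0.1]],
    "spk_b": [[1.0, 1.0], [0.9, 1.1]],
}

# Query vector built from two toy frames that lie near spk_b.
query = speaker_vector([[0.95, 1.0], [1.05, 1.0]])
ranking = retrieve(query, library)
print(ranking)  # prints ['spk_b', 'spk_a']
```

The first entry of the ranking is the retrieved speaker. In the setting of the abstract, the frames would come from the RBM-based extractor at a chosen depth rather than from hand-written values.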