Authors: YAO Wenhan (姚文翰); KE Dengfeng (柯登峰); HUANG Liangjie (黄良杰); HU Ruixin (胡睿欣); XIANG Minte (项敏特); ZHANG Jinsong (张劲松) [1] (Department of Information Science, Beijing Language and Culture University, Beijing 100089, China)
Source: Journal of Zhengzhou University: Natural Science Edition (《郑州大学学报(理学版)》), 2023, No. 5, pp. 67-72 (6 pages)
Funding: Chinese Testing International (汉考国际) Research Fund Project (HT-202011-374).
Abstract: Emotional voice conversion is a technique that converts speech from one emotion to another without changing the speaker's voiceprint or the semantic content; in essence, it is style transfer for speech. Mainstream style-transfer techniques include generative adversarial networks (such as CycleGAN and StarGAN) and instance normalization (such as IN and CIN). Compared with IN, CIN adds a mean-variance selection module and therefore has stronger style-transfer ability. The paper proposes CIN-StarGAN, an emotional voice conversion model that combines StarGAN and CIN by embedding the CIN module into the StarGAN generator. Experimental results on the ESD dataset show that CIN-StarGAN converges 28% faster than a CycleGAN-based emotion conversion model and has good style-conversion ability. The approach has potential research value for multi-domain emotion conversion methods.
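The abstract does not give the layer definition, so the following is only a minimal sketch of one common formulation of conditional instance normalization (CIN), not the authors' implementation. It assumes features of shape (channels, frames) and one learned scale/shift pair per target emotion; the names `conditional_instance_norm`, `gammas`, and `betas`, and all sizes, are hypothetical.

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    """Plain IN: normalize each channel of one utterance over the time axis."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def conditional_instance_norm(x, label, gammas, betas, eps=1e-5):
    """CIN sketch: IN followed by a scale and shift selected by the target label.

    x      : (channels, frames) feature map
    label  : integer index of the target emotion (assumed encoding)
    gammas : (num_emotions, channels) per-emotion scales (hypothetical parameters)
    betas  : (num_emotions, channels) per-emotion shifts (hypothetical parameters)
    """
    x_hat = instance_norm(x, eps)
    gamma = gammas[label][:, None]  # pick the scale row for this emotion
    beta = betas[label][:, None]    # pick the shift row for this emotion
    return gamma * x_hat + beta

# Toy data: 4 emotions, 8 channels, 50 frames (arbitrary illustrative sizes).
rng = np.random.default_rng(0)
num_emotions, channels, frames = 4, 8, 50
x = rng.normal(size=(channels, frames))
gammas = rng.normal(loc=1.0, scale=0.1, size=(num_emotions, channels))
betas = rng.normal(scale=0.1, size=(num_emotions, channels))

y = conditional_instance_norm(x, label=2, gammas=gammas, betas=betas)
```

After the call, each channel of `y` has a mean close to the selected shift and a standard deviation close to the magnitude of the selected scale, which is how the label steers the output statistics toward a target style in this formulation.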
Keywords: emotional voice conversion; domain conversion; conditional instance normalization; generative adversarial network
Classification number: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]