Study on Restoration of Electronic Disguised Voice Based on DC-CNN  Cited by: 5


Authors: WANG Yong-quan (王永全) [1,2]; SHI Zheng-yu (施正昱); ZHANG Xiao (张晓) [4]

Affiliations: [1] School of Criminal Justice, East China University of Political Science and Law, Shanghai 201620, China; [2] Department of Information Science and Technology, East China University of Political Science and Law, Shanghai 201620, China; [3] School of Data Science, Fudan University, Shanghai 200433, China; [4] Key Laboratory of Information Network Security of Ministry of Public Security, The Third Research Institute of the Ministry of Public Security, Shanghai 200120, China

Source: Computer Science (《计算机科学》), 2019, No. 8, pp. 183-188 (6 pages)

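Funding: Supported by the Major Program of the National Social Science Foundation of China, 2014 (second batch) (14ZDB147); the Ministry of Public Security Special Project for Basic Work on Strengthening Police through Science and Technology (2017GABJC33); the Ministry of Education 2017 second-batch "Cloud-Data Integration Science and Education Innovation" Fund project (2017B06106); and the East China University of Political Science and Law key general-education course construction project "Introduction to Artificial Intelligence" (A-0312-18-174794).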

Abstract: Research on restoring electronically disguised voice has seen no breakthrough in the construction of restoration models. To address this, the paper proposes a restoration model for electronic disguised voice based on a Dilated Causal-Convolution Neural Network (DC-CNN). With DC-CNN as its framework, the model applies convolution and nonlinear mapping to the acoustic information of the historical sampling points of the disguised voice together with restoring factors. The network uses skip connections to improve propagation through deep layers, and outputs the restored voice after a companding transformation. The model is characterized by nonlinear mapping, extensibility, multi-adaptability and conditionality, and concurrency. In the experiments, piano music and English speech were each disguised using three basic voice-changing functions (pitch, tempo, and rate) and then restored by the model. The restored voice was compared with the original voice through voiceprint feature comparison, LPC data analysis, and human listening tests of voice identity. The results show that the voiceprint features of the restored voice closely match those of the original, and that the formant waveforms are restored with high quality. The overall restoration fitting rates of the formant parameters reach 79.03% for piano music and 79.06% for English speech, far above the 35% similarity between the disguised voice and the original voice. This indicates that the model effectively reduces the electronic disguise characteristics in the voice and restores electronically disguised piano music and English speech well.
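The building blocks named in the abstract (dilated causal convolution, a gated activation unit, skip connections, and companding) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the tanh-times-sigmoid gate and the mu-law companding formula are assumptions borrowed from common practice in sample-level audio models, the record does not specify them, and every function name here is hypothetical. The restoring-factor conditioning is omitted.

```python
import math

def causal_dilated_conv(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k*dilation]; samples before t=0 count as zero.

    Causal: each output depends only on the current and earlier input samples.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y

def gated_activation(filter_out, gate_out):
    """Assumed gate form: tanh(filter) * sigmoid(gate), element-wise."""
    return [math.tanh(f) / (1.0 + math.exp(-g))
            for f, g in zip(filter_out, gate_out)]

def residual_layer(x, w_filter, w_gate, dilation):
    """One layer: gated dilated convolution, plus the input as a residual.

    Returns (residual_output, skip_output); skip outputs from all layers
    would be summed before the final output stage.
    """
    f = causal_dilated_conv(x, w_filter, dilation)
    g = causal_dilated_conv(x, w_gate, dilation)
    z = gated_activation(f, g)
    return [xi + zi for xi, zi in zip(x, z)], z

def receptive_field(kernel_size, dilations):
    """Number of input samples that influence one output of a layer stack."""
    return 1 + (kernel_size - 1) * sum(dilations)

def mu_law_encode(x, mu=255):
    """Assumed companding: compress x in [-1, 1] with the mu-law curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_decode(y, mu=255):
    """Inverse of mu_law_encode (expansion)."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)
```

Doubling the dilation at each layer makes the receptive field grow exponentially with depth while the parameter count grows only linearly, which is the usual motivation for dilated convolutions when a model must draw on many historical sampling points.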

Keywords: DC-CNN; electronic disguised voice; restored voice; restoring factor; gated activation unit

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
