Authors: WANG Xueguang [1]; ZHU Junwen; ZHANG Aixin [2]
Affiliations: [1] College of Criminal Justice, East China University of Political Science and Law, Shanghai 200042, China; [2] School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Source: Computer Science (《计算机科学》), 2023, No. 11, pp. 177-184 (8 pages)
Funding: National Key Research and Development Program of China (2017YFB0802103).
Abstract: The emergence of AI voice cloning technology will deal a severe blow to the legal order of modern society. In recent years, researchers have focused only on cases where the AI-synthesized speech has the same content as the sample speech; little work has addressed the forensic identification of questioned recordings whose content differs from the sample, and such material currently cannot be identified. This paper therefore proposes a three-dimensional method, based on an improved MFCC feature model, for identifying the source of AI cloned speech. First, it verifies the characteristics of AI cloned speech that earlier researchers found through manual analysis, and summarizes two identifiable features: "abnormally active formant F5" and "abnormal abrupt changes in the energy, formant, and pitch curves". Second, based on these characteristics, it uses the second-order difference to correct the MFCC coefficients and applies the "inverse logic deduction method" to further quantify and sample the abrupt changes in the energy, formant, and pitch curves, defining the result as a feature-vector triple for speech identification. Third, taking the feature-vector triple as input, it uses the D-S evidence combination rule to fuse the three sets of comparison results between the questioned material and the sample. The outcome is a three-dimensional evaluation model for questioned material based on improved MFCC feature parameters. In an experiment with random sampling from a population, the method identified AI cloned speech synthesized from the same person as the clone source with an average probability of 67.324% and a standard deviation of 7.32%, which the paper reports as good identification performance.
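The fusion step in the abstract names the D-S (Dempster-Shafer) evidence combination rule. The following is a minimal sketch of Dempster's rule of combination applied to three evidence sources, not the paper's implementation. The frame of discernment (`same` / `diff`), the function names, and all mass values are hypothetical illustrations; the abstract does not say how the paper derives its mass assignments from the feature-vector triple.

```python
from functools import reduce
from itertools import product

# Hypothetical frame of discernment: the questioned recording was cloned
# from the same speaker as the sample ("same") or from someone else ("diff").
SAME = frozenset({"same"})
DIFF = frozenset({"diff"})
EITHER = SAME | DIFF  # mass placed here expresses ignorance


def combine(m1, m2):
    """Dempster's rule: fuse two mass functions keyed by frozenset."""
    fused = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            fused[common] = fused.get(common, 0.0) + wa * wb
        else:
            # Mass falling on the empty set is conflicting evidence.
            conflict += wa * wb
    if 1.0 - conflict <= 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in fused.items()}


# Illustrative (made-up) masses for the three feature dimensions.
m_energy = {SAME: 0.6, DIFF: 0.1, EITHER: 0.3}
m_formant = {SAME: 0.5, DIFF: 0.2, EITHER: 0.3}
m_pitch = {SAME: 0.7, DIFF: 0.1, EITHER: 0.2}

fused = reduce(combine, [m_energy, m_formant, m_pitch])
for hypothesis, mass in sorted(fused.items(), key=lambda kv: -kv[1]):
    print(sorted(hypothesis), round(mass, 4))
```

Because the rule is associative and commutative, folding it over the three sources with `reduce` gives the same result in any order. When all three sources lean toward the same hypothesis, the fused mass on that hypothesis exceeds each individual mass, which is the usual motivation for fusing evidence this way.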
Keywords: AI cloned voice; MFCC features; three-dimensional speech modeling; speech source identification
Classification: TP391 [Automation and Computer Technology - Computer Application Technology]