Singing Voice Detection Algorithm Based on a Scaled Dot-product Attention Embedded Convolutional Neural Network

Cited by: 1


Authors: GUI Wenming [1,2]; ZENG Yue; ZANG Xian [3]

Affiliations: [1] School of Software Engineering, Jinling Institute of Technology, Nanjing, Jiangsu 211169, China; [2] Key Lab of Broadband Wireless Communication and Sensor Network Technology (Nanjing University of Posts and Telecommunications), Ministry of Education, Nanjing, Jiangsu 210003, China; [3] School of Electronic and Information Engineering, Jinling Institute of Technology, Nanjing, Jiangsu 211169, China

Source: Journal of Signal Processing (《信号处理》), 2021, No. 10, pp. 1899-1906 (8 pages)

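Funding: National Natural Science Foundation of China (61872199); Open Research Fund of the Key Lab of Broadband Wireless Communication and Sensor Network Technology (Nanjing University of Posts and Telecommunications), Ministry of Education (201908); Overseas Research and Training Program for Outstanding Young and Middle-Aged University Teachers and Principals, Jiangsu Provincial Department of Education (2018191).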

Abstract: Conventional singing voice detection usually relies on complicated feature engineering, whereas algorithms built on a unified deep neural network framework can learn the features themselves and thus dispense with feature engineering. However, the learned features are not differentiated by importance and carry equal weight in the network. To address this problem, a convolutional neural network with an embedded scaled dot-product self-attention module is proposed. The algorithm learns an attention distribution over the feature maps and adjusts their weights accordingly, so that the convolutional neurons can "observe" the features according to their importance, which improves the overall performance of the network. In experiments on two public datasets, the proposed model improved accuracy over the baseline model by 1.96% and 1.76%, respectively, demonstrating that the algorithm is effective for singing voice detection.
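The attention step named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how the module is wired into the network, so the sketch assumes that each feature map (channel) of a convolutional layer is flattened into one token and that the function name `channel_self_attention` and the optional projection matrices `w_q` and `w_k` are hypothetical. It computes the standard scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, and uses the result to reweight the feature maps.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the maximum along the axis for numerical stability.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)


def channel_self_attention(feature_maps, w_q=None, w_k=None):
    """Reweight CNN feature maps with scaled dot-product self-attention.

    feature_maps: array of shape (C, H, W), e.g. the output of a conv layer.
    w_q, w_k: optional (H*W, d) projection matrices (hypothetical learned
        parameters); identity projections are used when they are omitted.
    Returns the attended maps, shape (C, H, W), and the attention
    weights, shape (C, C).
    """
    c, h, w = feature_maps.shape
    x = feature_maps.reshape(c, h * w)      # assumption: one token per feature map
    q = x if w_q is None else x @ w_q       # queries
    k = x if w_k is None else x @ w_k       # keys
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)         # scaled dot products, shape (C, C)
    weights = softmax(scores, axis=-1)      # attention distribution per feature map
    attended = weights @ x                  # values are the feature maps themselves
    return attended.reshape(c, h, w), weights


# Illustrative input: 8 random feature maps of size 5 x 4.
rng = np.random.default_rng(0)
maps = rng.standard_normal((8, 5, 4))
attended, weights = channel_self_attention(maps)
print(attended.shape, weights.shape)
```

Each row of `weights` sums to 1, so every output map is a convex combination of the input maps. In a trained network the projections would be learned jointly with the convolutional layers, which is how the feature maps would come to be weighted by importance.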

Keywords: singing voice detection; convolutional neural network; cosine attention; dot-product self-attention

Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]

 
