Pitch Estimation Model Based on Mamba-UNet Architecture


Author: PENG Zujian (彭祖剑) (Kaipu Cloud Information Technology Co., Ltd., Dongguan 523000, China)

Affiliation: [1] Kaipu Cloud Information Technology Co., Ltd., Dongguan 523000, Guangdong, China

Source: Audio Engineering (《电声技术》), 2024, No. 9, pp. 50-52, 56 (4 pages in total)

Abstract: Pitch estimation algorithms for single-source audio mainly include the Robust Algorithm for Pitch Tracking (RAPT), the Sawtooth Waveform Inspired Pitch Estimator (SWIPE), and Harvest. However, when the source is polyphonic music, such as vocals with musical accompaniment, these algorithms show clear shortcomings on the vocal pitch estimation task. Drawing on existing research, this paper improves the conventional Robust Model for Vocal Pitch Estimation (RMVPE) and proposes Mamba-RMVPE, a model based on the Mamba-UNet architecture, to address vocal pitch estimation in multi-source audio such as polyphonic music. Compared with the conventional RMVPE, Mamba-RMVPE improves Raw Pitch Accuracy (RPA), Raw Chroma Accuracy (RCA), and Overall Accuracy (OA), and substantially reduces inference time.
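
The three metrics named in the abstract can be illustrated with a small NumPy sketch. It is not the paper's evaluation code. It assumes the common melody-evaluation conventions: pitch tracks are given per frame in Hz, 0 Hz marks an unvoiced frame, and a frame is correct when it lies within 50 cents of the reference. The function names `hz_to_cents` and `pitch_metrics` are hypothetical, and an estimate of 0 Hz on a voiced reference frame is simply counted as wrong.

```python
import numpy as np


def hz_to_cents(f_hz, ref_hz=10.0):
    """Convert Hz to cents relative to ref_hz; unvoiced frames (<= 0 Hz) map to 0."""
    f_hz = np.asarray(f_hz, dtype=float)
    cents = np.zeros_like(f_hz)
    voiced = f_hz > 0
    cents[voiced] = 1200.0 * np.log2(f_hz[voiced] / ref_hz)
    return cents


def pitch_metrics(ref_hz, est_hz, tol_cents=50.0):
    """Return (RPA, RCA, OA) for two equal-length, non-empty per-frame pitch tracks.

    Assumption: 0 Hz denotes an unvoiced frame in both tracks.
    """
    ref_hz = np.asarray(ref_hz, dtype=float)
    est_hz = np.asarray(est_hz, dtype=float)
    if ref_hz.shape != est_hz.shape or ref_hz.size == 0:
        raise ValueError("tracks must be non-empty and have the same length")

    ref_voiced = ref_hz > 0
    est_voiced = est_hz > 0
    both_voiced = ref_voiced & est_voiced

    # Absolute pitch error in cents (only meaningful where both frames are voiced).
    diff = np.abs(hz_to_cents(est_hz) - hz_to_cents(ref_hz))
    # Fold the error onto the nearest octave so octave errors are forgiven (chroma).
    chroma_diff = np.abs(diff - 1200.0 * np.round(diff / 1200.0))

    pitch_ok = both_voiced & (diff < tol_cents)
    chroma_ok = both_voiced & (chroma_diff < tol_cents)

    n_voiced = ref_voiced.sum()
    # RPA / RCA: share of voiced reference frames with correct pitch / chroma.
    rpa = pitch_ok.sum() / n_voiced if n_voiced else 0.0
    rca = chroma_ok.sum() / n_voiced if n_voiced else 0.0

    # OA: correct-pitch voiced frames plus correctly detected unvoiced frames,
    # divided by the total number of frames.
    unvoiced_ok = (~ref_voiced) & (~est_voiced)
    oa = (pitch_ok.sum() + unvoiced_ok.sum()) / ref_hz.size
    return float(rpa), float(rca), float(oa)


if __name__ == "__main__":
    # Toy tracks: frame 2 is an octave error, frame 4 is a missed voiced frame.
    ref = [220.0, 220.0, 0.0, 440.0]
    est = [220.0, 440.0, 0.0, 0.0]
    print(pitch_metrics(ref, est))
```

In the toy tracks, the octave error lowers RPA but not RCA, which is the difference between the two metrics; OA additionally rewards the unvoiced frame that was correctly detected.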

Keywords: polyphonic music; pitch estimation; Robust Model for Vocal Pitch Estimation (RMVPE); Mamba-UNet

Classification No.: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]

 
