Total Variation Regularization Manifold Learning Method for Speech Feature Extraction

Original title: 语音声特征提取的总变分正则化流形学习方法

Authors: ZHANG Kaiye; ZHAO Hualiang; LIU Zhihong; XU Xixin [1,2]; LI Jianhua

Affiliations: [1] College of Mechanical and Automotive Engineering, Qingdao University of Technology, Qingdao 266520, Shandong, China; [2] Key Laboratory of Industrial Fluid Energy-saving and Pollution Control, Ministry of Education, Qingdao University of Technology, Qingdao 266520, Shandong, China

Source: Noise and Vibration Control (《噪声与振动控制》), 2025, No. 2, pp. 97-104 (8 pages)

Funding: Natural Science Foundation of Shandong Province (ZR2023MF018); National Natural Science Foundation of China (61871447).

Abstract: Speech signals exhibit pronounced time-frequency sparsity, time variability, and high-dimensional nonlinearity. To characterize and extract their acoustic features effectively, a total variation regularized manifold learning method is proposed. Building on the locally linear embedding (LLE) algorithm, the preprocessed speech signal is Fourier-transformed twice in succession, long-term amplitude features are then extracted by statistical analysis, and acoustic feature vectors containing both short-term and long-term amplitude features are assembled into a high-dimensional feature matrix. The k-neighborhood of this matrix is then optimized using total variation. Finally, a total variation regularized manifold learning model for acoustic feature extraction is constructed under a weight-energy minimization constraint; the optimal weights are obtained by convex optimization, and the low-dimensional manifold of the speech acoustic features is resolved. Analysis and comparison with other methods indicate that the proposed method gives the acoustic feature manifold a clear physical meaning, avoids distortion of the manifold, and substantially reduces the numerical computation required, improving computation speed. It thereby provides methodological support for machine learning and pattern recognition in intelligent speech applications.
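The abstract names locally linear embedding (LLE) as the basis of the method but gives no formulas. As a reading aid only, the following is a minimal sketch of standard LLE with an L2 (energy) penalty on the reconstruction weights. It is not the authors' model: the total variation optimization of the k-neighborhood and the convex program are not reproduced, and the function names `lle_weights` and `lle_embed`, the parameter `reg`, and the trace scaling of the penalty are assumptions made for illustration.

```python
import numpy as np

def lle_weights(X, k, reg=1e-3):
    """Standard LLE reconstruction weights with an L2 (energy) penalty.

    X is an (n, D) feature matrix, one feature vector per row.
    Assumption: `reg` scales the penalty by the trace of the local Gram
    matrix, a common convention; the paper's own constraint may differ.
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    # Pairwise squared Euclidean distances between feature vectors
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    for i in range(n):
        idx = np.argsort(d2[i])[:k]        # indices of the k nearest neighbors
        Z = X[idx] - X[i]                  # neighbors centered on x_i
        G = Z @ Z.T                        # local Gram matrix, shape (k, k)
        G = G + reg * np.trace(G) * np.eye(k)  # energy penalty on the weights
        w = np.linalg.solve(G, np.ones(k))
        W[i, idx] = w / w.sum()            # enforce sum-to-one constraint
    return W

def lle_embed(X, k, d, reg=1e-3):
    """d-dimensional embedding from the bottom eigenvectors of (I-W)^T (I-W)."""
    n = X.shape[0]
    W = lle_weights(X, k, reg)
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, vecs = np.linalg.eigh(M)            # eigenvalues in ascending order
    return vecs[:, 1:d + 1]                # skip the constant eigenvector
```

Calling `lle_embed(X, k=8, d=2)` on a feature matrix `X` with one feature vector per row returns a two-column embedding. A total variation term would enter at the weight-solving step, which is where the paper's model departs from this sketch.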

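Keywords: acoustics; speech signal; regularized manifold; total variation; high-dimensional feature matrix; k-neighborhood; acoustic feature extraction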

Classification: TN911.7 [Electronics and Telecommunications: Communication and Information Systems]

 
