Authors: 徐正丽 (XU Zhengli), 肖素芳 (XIAO Sufang), 简敏 (JIAN Min), 杨明浩 (YANG Minghao) [2]
Affiliations: [1] Guilin University of Electronic Technology, Guilin, Guangxi 541004, China; [2] Institute of Automation of the Chinese Academy of Sciences, Beijing 100190, China
Source: Guangxi Sciences (《广西科学》), 2023, No. 4, pp. 745-753 (9 pages)
Funding: Supported by the National Natural Science Foundation of China (71463010, 22180155466); the Guangxi Science and Technology Program (2021GXNSFBA220048, 桂科AB21220038); and the Guilin Science and Technology Program (2023010123).
Abstract: The tongue is a key articulator in human speech production, and dimensionality reduction of its shape during articulation can help linguists analyze human pronunciation patterns. Principal Component Analysis (PCA) is currently the most common method for dimensionality reduction of tongue position contours. In recent years, deep-learning-based autoencoders have been shown to outperform PCA at dimensionality reduction. However, they require a large number of samples, and tongue movement data are hard to collect in volume because the tongue is hidden inside the oral cavity, so conventional autoencoders cannot be applied directly to tongue contour modeling. To address these limitations, this paper proposes a two-stage autoencoder dimensionality reduction method for small-sample tongue motion contour data. In the first stage, an Active Shape Model (ASM) is used to generate a large amount of physiological deformation data of tongue contours, and a general tongue contour reconstruction model is built on a conventional autoencoder. In the second stage, a dimensionality reduction layer is added to the first-stage model to compress and analyze tongue position contour data. The experiments use 240 vowel tongue shapes obtained from X-ray films of human speech and compare the proposed method with conventional PCA. The results show that the vowel tongue position map produced by the proposed method is better separated in the two-dimensional plane than the PCA map, and that the method achieves better tongue shape dimensionality reduction and reconstruction performance.
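Keywords: deep neural network; autoencoder; principal component analysis; tongue position contour; hidden units
Classification code: TP389 [Automation and Computer Technology: Computer System Architecture]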
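The data-generation step and the PCA baseline named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy contours, the 20-point contour length, the two retained modes, the ±3 standard deviation limit on the shape coefficients (a common ASM convention), and all function names are assumptions made for illustration. The autoencoder stages themselves are not shown; in a setup like the one described, the array `synthetic` would serve as training data for the first-stage reconstruction model.

```python
import numpy as np

def fit_shape_model(shapes, n_modes):
    """Fit a point distribution model: mean shape plus principal deformation modes."""
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    # SVD of the centered data yields the PCA modes (rows of vt, orthonormal).
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigvals = (s ** 2) / (shapes.shape[0] - 1)
    return mean, vt[:n_modes], eigvals[:n_modes]

def sample_shapes(mean, modes, eigvals, n_samples, rng, limit=3.0):
    """Draw synthetic shapes x = mean + b @ modes with |b_i| <= limit * sqrt(lambda_i)."""
    std = np.sqrt(eigvals)
    b = rng.normal(size=(n_samples, len(eigvals))) * std
    b = np.clip(b, -limit * std, limit * std)
    return mean + b @ modes

def pca_project(shapes, mean, modes):
    """PCA baseline: project shapes onto the retained modes."""
    return (shapes - mean) @ modes.T

rng = np.random.default_rng(0)

# Toy stand-in for 240 measured contours: 20 (x, y) points flattened to 40 values.
t = np.linspace(0.0, np.pi, 20)
base = np.concatenate([np.cos(t), np.sin(t)])
real = base + 0.1 * rng.normal(size=(240, 3)) @ rng.normal(size=(3, 40))

mean, modes, eigvals = fit_shape_model(real, n_modes=2)
synthetic = sample_shapes(mean, modes, eigvals, n_samples=1000, rng=rng)
codes = pca_project(real, mean, modes)            # 2-D PCA map of the toy contours
synthetic_codes = pca_project(synthetic, mean, modes)
```

Because the modes are orthonormal, projecting the synthetic shapes back onto them recovers the sampled coefficients, so every synthetic contour stays within the chosen deformation limits of the fitted model.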