Lip motion recognition of speaker based on SIFT (基于SIFT的说话人唇动识别)  Cited by: 2


Authors: MA Xinjun (马新军)[1], WU Chenchen (吴晨晨)[1], ZHONG Qianyuan (仲乾元), LI Yuanyuan (李园园)[1]

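Affiliation: [1] School of Mechanical Engineering and Automation, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China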

Source: Journal of Computer Applications (《计算机应用》), 2017, No. 9, pp. 2694-2699 (6 pages)

Funding: National Natural Science Foundation of China (51677035); Shenzhen Basic Research Project (JCYJ20150513151706580); Shenzhen Science and Technology Plan Project (GRCK2016082611021550)

Abstract: To address the high dimensionality of lip features and their sensitivity to scale space, a speaker authentication technique that uses the Scale-Invariant Feature Transform (SIFT) algorithm for feature extraction was proposed. Firstly, a simple video frame normalization algorithm was proposed to adjust lip motion videos of different lengths to the same length and to extract representative lip motion images. Then, an algorithm was proposed to extract texture and motion features on the basis of SIFT key points; after integration by Principal Component Analysis (PCA), representative lip motion features were obtained for authentication. Finally, a simple classification algorithm was presented based on the obtained features. The experimental results show that, compared with the common Local Binary Pattern (LBP) and Histogram of Oriented Gradients (HOG) features, the proposed feature extraction algorithm achieves better False Acceptance Rate (FAR) and False Rejection Rate (FRR), which indicates that the overall speaker lip motion recognition algorithm is effective and yields fairly satisfactory results.
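Two steps named in the abstract can be illustrated without access to the full paper: bringing clips of different lengths to a common length, and scoring an authentication system by FAR and FRR. The sketch below is only an illustration under stated assumptions. The abstract does not describe the paper's normalization rule, so uniform index sampling is swapped in here; the function names, the distance-style scores, and the "accept when score <= threshold" rule are likewise hypothetical and not taken from the paper. The FAR and FRR formulas follow the standard definitions (false accepts over impostor trials, false rejects over genuine trials).

```python
def resample_frame_indices(num_frames, target_len):
    """Pick target_len frame indices spread evenly over a clip.

    Assumption: uniform sampling. The paper's own frame
    normalization algorithm is not given in the abstract.
    """
    if num_frames <= 0 or target_len <= 0:
        raise ValueError("num_frames and target_len must be positive")
    if target_len == 1:
        return [0]
    # Spacing between sampled positions, first and last frame included.
    step = (num_frames - 1) / (target_len - 1)
    return [round(i * step) for i in range(target_len)]


def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for distance-like scores.

    Assumption: a trial is accepted when its score <= threshold.
    FAR = accepted impostor trials / all impostor trials.
    FRR = rejected genuine trials / all genuine trials.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    false_accepts = sum(1 for s in impostor_scores if s <= threshold)
    false_rejects = sum(1 for s in genuine_scores if s > threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr
```

Sweeping `threshold` over the observed score range and recording both rates would show the usual trade-off: lowering the threshold reduces FAR at the cost of a higher FRR, and vice versa.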

Keywords: lip feature; Scale-Invariant Feature Transform (SIFT); feature extraction; speaker recognition

Classification number: TP391.72 [Automation and Computer Technology: Computer Application Technology]

 
