Video-based facial expression recognition using multimodal deep convolutional neural networks

Cited by: 19

Authors: PAN Xian-zhang[1], ZHANG Shi-qing[1], GUO Wen-ping[1]

Affiliation: [1] Institute of Intelligent Information Processing, Taizhou College, Taizhou 318000, Zhejiang, China

Source: Optics and Precision Engineering (《光学精密工程》), 2019, No. 4, pp. 963-970 (8 pages)

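Funding: Zhejiang Provincial Public Welfare Technology Research Program (No. LGF19F020009); Zhejiang Provincial Natural Science Foundation (No. LY14F020036; No. LY16F020011); National Natural Science Foundation of China (No. 61203257)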

Abstract: Recognizing facial expressions in video sequences is challenging because hand-crafted features extracted from video correlate only weakly with subjective emotions. To overcome this limitation and improve facial expression recognition in videos, the proposed method uses two deep convolutional neural networks (DCNNs), a spatial convolutional neural network and a temporal convolutional neural network, to learn spatio-temporal expression features from videos. The spatial network extracts spatial features from the static expression image in each video frame, whereas the temporal network extracts dynamic features from the optical flow computed over multiple expression frames. The spatio-temporal features learned by the two networks are then fused at the feature level using a deep belief network (DBN), and the fused features are fed to a support vector machine for facial expression classification. On the public RML and BAUM-1s video emotion datasets, the method achieves recognition accuracies of 71.06% and 52.18%, respectively, clearly outperforming results previously reported in the literature. These results indicate that the multimodal DCNN approach improves facial expression recognition in videos.
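The abstract describes combining spatial and temporal features before classification. As a rough illustration of where the two feature streams meet, the sketch below pools per-frame features into one video-level vector per stream and concatenates them. It is not the authors' implementation: the mean pooling, the L2 normalization, the feature dimensions, and the function names (`pool_video_features`, `l2_normalize`, `fuse_features`) are all assumptions made for illustration, and plain concatenation stands in for the DBN-based fusion and SVM classifier, which are not reproduced here.

```python
import numpy as np


def pool_video_features(frame_features: np.ndarray) -> np.ndarray:
    """Average per-frame feature vectors into one video-level vector.

    Mean pooling is an assumption; the abstract does not say how
    frame-level features are aggregated.
    """
    if frame_features.ndim != 2 or frame_features.shape[0] == 0:
        raise ValueError("expected a non-empty (num_frames, dim) array")
    return frame_features.mean(axis=0)


def l2_normalize(vector: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale a vector to unit L2 norm so neither stream dominates."""
    return vector / max(float(np.linalg.norm(vector)), eps)


def fuse_features(spatial: np.ndarray, temporal: np.ndarray) -> np.ndarray:
    """Concatenate pooled spatial and temporal features.

    Concatenation is a stand-in for the DBN-based feature-level
    fusion described in the abstract.
    """
    pooled_spatial = l2_normalize(pool_video_features(spatial))
    pooled_temporal = l2_normalize(pool_video_features(temporal))
    return np.concatenate([pooled_spatial, pooled_temporal])


# Hypothetical inputs: random stand-ins for CNN outputs of one video.
rng = np.random.default_rng(0)
spatial = rng.normal(size=(30, 8))   # 30 frames, 8-dim spatial features
temporal = rng.normal(size=(29, 8))  # 29 optical-flow steps, 8-dim features
fused = fuse_features(spatial, temporal)
print(fused.shape)
```

In a full pipeline, the fused vector for each video would be passed to the fusion network and then to the classifier; normalizing each stream first is one common way to keep the two modalities on a comparable scale.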

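Keywords: deep convolutional neural network; multimodal deep learning; expression recognition; spatio-temporal features; deep belief network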

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
