Dynamic facial expression recognition integrating spatiotemporal features

Authors: LIU Baobao; TAO Lu; YANG Jingjing; WANG Heying (School of Computer Science, Xi’an Polytechnic University, Xi’an 710048, China)

Affiliation: [1] School of Computer Science, Xi’an Polytechnic University, Xi’an, Shaanxi 710048, China

Source: Journal of Xi’an Polytechnic University, 2024, No. 6, pp. 105-113 (9 pages)

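Funding: Special Scientific Research Program on Information Assurance of the Shaanxi Provincial Department of Education (20JX004); Natural Science Basic Research Program of Shaanxi Province, General Project (2020JM-574).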

Abstract: To address the difficulty of extracting key facial features and of capturing dynamic changes of expression in natural environments, a keyframe-based network named TDRAG (three-dimensional resnet and attention mechanism with GRU) is proposed. The network effectively mines the spatiotemporal information of video sequences. First, redundancy coefficients are used to select keyframes, reducing the interference of repeated information. Second, three-dimensional residual attention blocks are designed to sharpen the network's focus on key spatial regions of the expression sequence, so that it learns facial features that are robust to occlusion and pose variation. Finally, a gated recurrent unit (GRU) is used to strengthen the model's sensitivity to, and ability to interpret, changes along the temporal dimension, giving the network a deeper understanding of how an expression sequence evolves. Experimental results show that, compared with the baseline model 3DResNet18, TDRAG improves weighted average recall (WAR) by 4.27% and unweighted average recall (UAR) by 4.16% on the DFEW dataset, which validates its effectiveness in extracting key facial features and in improving the accuracy of dynamic facial expression recognition.

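Keywords: dynamic expression recognition; three-dimensional convolutional network; keyframes; three-dimensional attention; gated recurrent unit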

Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]
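
The abstract reports results as WAR and UAR. As a minimal sketch of how these two metrics are commonly computed (this is not the authors' evaluation code, and the function name `war_uar` and the toy labels are assumptions made for illustration): WAR is the recall averaged over classes weighted by class size, which equals overall accuracy, while UAR is the plain mean of per-class recalls.

```python
from collections import Counter


def war_uar(y_true, y_pred):
    """Return (WAR, UAR) for two equal-length label sequences.

    WAR: class recalls weighted by class frequency (equals overall accuracy).
    UAR: unweighted mean of class recalls, so rare classes count equally.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and equal length")

    total = Counter(y_true)  # samples per true class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    recalls = {c: correct[c] / n for c, n in total.items()}
    war = sum(correct.values()) / len(y_true)
    uar = sum(recalls.values()) / len(recalls)
    return war, uar


# Hypothetical, imbalanced toy labels (not data from the paper).
y_true = ["happy"] * 8 + ["sad"] * 2
y_pred = ["happy"] * 8 + ["happy", "sad"]

war, uar = war_uar(y_true, y_pred)
print(f"WAR = {war:.2f}, UAR = {uar:.2f}")  # WAR = 0.90, UAR = 0.75
```

On imbalanced data the two values diverge, as in the toy example, which is presumably why both are reported for an in-the-wild dataset such as DFEW.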
