Attention-guided local feature joint learning for facial expression recognition  Cited by: 1


Authors: Lu Lidan, Xia Haiying[1], Tan Yumei, Song Shuxiang[1]

Affiliations: [1] Guangxi Key Laboratory of Brain-inspired Computing and Intelligent Chips, School of Electronic and Information Engineering, Guangxi Normal University, Guilin 541004, China; [2] College of Big Data and Artificial Intelligence, Nanning College of Technology, Nanning 530105, China; [3] School of Computer Science and Engineering, Guangxi Normal University, Guilin 541004, China

Source: Journal of Image and Graphics (《中国图象图形学报》), 2024, Issue 8, pp. 2377-2387 (11 pages)

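Funding: National Natural Science Foundation of China (62106054); Guangxi Innovation-Driven Development Major Special Project (AA20302003); Guangxi Normal University Research Project, Natural Science category (2021JC012); Guilin Science and Technology Development Project (20222C243986).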

Abstract: Objective: In complex natural scenes, facial expression recognition is affected by local occlusions such as glasses, hand movements, and hairstyles, and these occluded regions reduce the model's ability to discriminate emotions. This paper therefore proposes an attention-guided local feature joint learning method for facial expression recognition. Method: The method consists of a global feature extraction module, a global feature enhancement module, and a local feature joint learning module. The global feature extraction module extracts middle-layer global features. The global feature enhancement module suppresses the redundant features introduced by the face recognition pre-trained model and strengthens the semantic information of the feature maps most relevant to emotion in the global face image. The local feature joint learning module uses a mixed attention mechanism to learn fine-grained salient features from different local facial regions, constrained by a joint loss. Results: Experiments were conducted on two in-the-wild datasets, RAF-DB (real-world affective faces database) and FERPlus. On RAF-DB, the recognition accuracy is 89.24%, an improvement of 0.84% over MA-Net (global multi-scale and local attention network). On FERPlus, the recognition accuracy is 90.04%, comparable to FER-VT (FER framework with two attention mechanisms). The experimental results indicate that the method is robust. Conclusion: By learning in the order of global enhancement followed by local refinement, the proposed method effectively reduces the interference of local occlusion.

Objective: When communicating face to face, people convey their inner emotions in various ways, such as conversational tone, body movements, and facial expressions. Among these, facial expression is the most direct means of observing human emotions. People convey their thoughts and feelings through facial expressions and also use them to recognize others' attitudes and inner world. Facial expression recognition is therefore one of the research directions in affective computing. It can be applied in many fields, such as fatigue driving detection, human-computer interaction, analysis of students' listening state, and intelligent medical services. However, in complex natural situations, facial expression recognition suffers from direct occlusion such as masks, sunglasses, gestures, hairstyles, or beards, as well as indirect occlusion such as different lighting, complex backgrounds, and pose variation. All of these pose great challenges to facial expression recognition in natural scenes, where extracting discriminative features is difficult and the final recognition results are consequently poor. We therefore propose an attention-guided joint learning method for local features in facial expression recognition to reduce the interference of occlusion and pose variation. Method: Our method is composed of a global feature extraction module, a global feature enhancement module, and a joint learning module for local features. First, we use ResNet-50 as the backbone network and initialize the network parameters using the MS-Celeb-1M face recognition dataset. We believe that the rich information available in the face recognition model can complement the contextual information needed for facial expression recognition, especially middle-layer features such as the eyes, nose, and mouth. Thus, the global feature extraction module is used to extract the global features of the middle layer, which consists of a 2D convolutional layer and three bottleneck residual convolu
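
The joint-loss constraint mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the loss, so everything here is an assumption made for illustration: the function names, the choice of softmax cross-entropy for each branch, and the `local_weight` balancing factor are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def softmax(logits: Sequence[float]) -> list:
    """Numerically stable softmax over one vector of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[label])


def joint_loss(
    global_logits: Sequence[float],
    local_logits: Sequence[Sequence[float]],
    label: int,
    local_weight: float = 1.0,  # hypothetical balancing factor, not from the paper
) -> float:
    """Assumed joint loss: global-branch cross-entropy plus the weighted
    mean cross-entropy of the local-region branches."""
    global_term = cross_entropy(global_logits, label)
    local_term = sum(cross_entropy(l, label) for l in local_logits) / len(local_logits)
    return global_term + local_weight * local_term


if __name__ == "__main__":
    # Toy 3-class example with two local regions; the second region
    # (imagine it is occluded) votes for the wrong class.
    clean = joint_loss([2.0, 0.1, 0.1], [[2.0, 0.1, 0.1], [2.0, 0.1, 0.1]], label=0)
    occluded = joint_loss([2.0, 0.1, 0.1], [[2.0, 0.1, 0.1], [0.1, 2.0, 0.1]], label=0)
    print(f"all regions agree: {clean:.4f}")
    print(f"one region misled: {occluded:.4f}")
```

Under this assumed form, every local branch is supervised with the same expression label as the global branch, so a region that is misled by occlusion raises the total loss and is pushed toward the correct class during training.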

Keywords: facial expression recognition; attention mechanism; local occlusion; local salient features; joint learning

Classification code: TP391 [Automation and Computer Technology - Computer Application Technology]

 
