A Dynamic Head Gesture Recognition Method that Fuses Attention Mechanism with 3D Two-Stream Convolution


Authors: ZHANG Botao [1], ZHU Xinyue, XIE Jialong, LU Qiang [1]

Affiliations: [1] School of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; [2] School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China

Source: Chinese Journal of Sensors and Actuators (《传感技术学报》), 2024, Issue 10, pp. 1734-1745 (12 pages)

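Funding: Zhejiang Provincial Key Research and Development Program (2019C04018); Key Project of the Zhejiang Provincial Natural Science Foundation (LZ23F030004); Fundamental Research Funds for the Provincial Universities of Zhejiang (GK229909299001-004); National Natural Science Foundation of China (62073108).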

Abstract: Head gestures convey rich emotional and intentional information and are an important means of human-computer interaction. Head gesture recognition methods based on wearable sensors achieve high recognition rates but are costly and inconvenient, while machine-vision-based methods generally suffer from low accuracy, poor generalization, and high computational cost. As a result, current head gesture recognition methods remain difficult to deploy on mobile robots. To address these problems, a dynamic head gesture recognition method that fuses an attention mechanism with 3D two-stream convolution (Fam-3DTSC) is proposed. Fam-3DTSC extracts RGB signals and optical flow features from video frames of dynamic head gestures and, inspired by the attention mechanism, extracts and enhances action features in the channel and spatial domains so that key features are captured accurately; the features are then fused and classified. Experimental results show that the proposed method effectively extracts the key channel-domain and spatial-domain information in head gestures, significantly improves the accuracy and generalization ability of head gesture recognition, and achieves high accuracy and real-time performance with limited computing power. The method was subsequently applied to an elderly-assistance robot and validated in a practical demonstration; the results indicate that it is suitable for mobile on-board computing platforms with limited computing resources, such as mobile robots.
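To make the pipeline in the abstract more concrete (channel-domain and spatial-domain enhancement of each stream, followed by fusion and classification), the following is a minimal sketch in plain Python. It is not the authors' Fam-3DTSC implementation: the parameter-free sigmoid gates, the average pooling, the concatenation-based fusion, and all names (`channel_attention`, `spatial_attention`, `recognize`, `weights`) are assumptions chosen for illustration. Each stream's feature map is assumed to be a list of channels, with each channel a flat list of spatio-temporal values that a 3D convolutional backbone would have produced.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feat):
    """Scale each channel by the sigmoid of its global average.
    Assumption: no learned weights; a real module would learn the gate."""
    gates = [sigmoid(sum(ch) / len(ch)) for ch in feat]
    return [[g * v for v in ch] for g, ch in zip(gates, feat)]

def spatial_attention(feat):
    """Scale each position by the sigmoid of its mean across channels."""
    n_pos = len(feat[0])
    gates = [sigmoid(sum(ch[i] for ch in feat) / len(feat)) for i in range(n_pos)]
    return [[g * v for g, v in zip(gates, ch)] for ch in feat]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pooled(feat):
    """Global average pooling: one descriptor value per channel."""
    return [sum(ch) / len(ch) for ch in feat]

def recognize(rgb_feat, flow_feat, weights):
    """Enhance both streams, fuse by concatenation, classify linearly.
    `weights` is a hypothetical classifier: one row per gesture class,
    one column per fused descriptor value."""
    streams = [spatial_attention(channel_attention(f)) for f in (rgb_feat, flow_feat)]
    fused = pooled(streams[0]) + pooled(streams[1])  # list concatenation
    scores = [sum(w * x for w, x in zip(row, fused)) for row in weights]
    return softmax(scores)

if __name__ == "__main__":
    rgb = [[1.0, 3.0], [-2.0, -2.0]]   # toy RGB-stream features (2 channels)
    flow = [[0.5, 0.1], [2.0, 1.0]]    # toy optical-flow-stream features
    w = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1]]  # 3 hypothetical classes
    print(recognize(rgb, flow, w))
```

Applying the channel gate before the spatial gate is one common ordering in attention modules; the abstract does not state which ordering, gate design, or fusion rule the paper actually uses.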

Keywords: mobile robot; human-computer interaction; attention mechanism; dynamic head gesture; action recognition

Classification code: TP242 [Automation and Computer Technology: Detection Technology and Automation Devices]

 
