Human action recognition based on skeleton dynamic temporal filter

Authors: LI Songyang; WANG Xueting; CHEN Xianglong; CHEN Enqing [1]

Affiliation: [1] School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001, China

Source: Journal of Graphics (图学学报), 2024, No. 4, pp. 760-769 (10 pages)

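Funding: National Natural Science Foundation of China (62101503, U1804152); Henan Province Science and Technology Research Project (222102210102); supported by the National Supercomputing Center in Zhengzhou.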

Abstract: Human action recognition is a key research area in computer vision, with applications such as intelligent surveillance and human-computer interaction. Existing skeleton-based methods usually cascade a graph convolutional network (GCN) with a temporal convolutional network (TCN), but the limited kernel size of the TCN restricts the model's global temporal modeling capability. Moreover, processing skeleton data with convolution alone gives the model little ability to discriminate between different skeleton points, and TCN feature extraction often involves repeated computation, so the TCN parameter count grows as the network deepens. To address these issues, the authors draw on signal processing and propose a skeleton dynamic temporal filtering (SDTF) module that replaces the TCN for global modeling of temporal features. On this basis they make lightweight improvements to AGCN; the resulting AGCN-SDTF action recognition model has lower complexity. SDTF models temporal features through the Fourier transform: the frequency-domain features obtained from the Fourier transform are multiplied by the frequency-domain output obtained by filtering, and the product is passed through an inverse Fourier transform, which extracts global temporal features. Experiments on the large-scale NTU-RGBD and Kinetics-Skeleton datasets show that the proposed model reduces the number of parameters and the computational cost while achieving recognition performance comparable to, or even better than, that of the original model.
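The transform, multiply, inverse-transform pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' SDTF module: the abstract does not say how the filter is generated dynamically, so the filter here is a fixed, hypothetical array, and the names `dft`, `temporal_filter`, `circular_conv`, `trajectory`, and `kernel` are all assumptions made for illustration. The sketch only demonstrates the underlying convolution theorem: pointwise multiplication in the frequency domain equals a circular convolution whose kernel spans every frame, which is why such a filter has a global temporal receptive field, unlike a short TCN kernel.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(T^2) discrete Fourier transform (for illustration, not speed)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        # The inverse transform carries the 1/T normalization.
        out.append(acc / n if inverse else acc)
    return out


def temporal_filter(signal, freq_filter):
    """Filter one joint-coordinate trajectory across all T frames at once."""
    spectrum = dft(signal)                                      # time -> frequency
    filtered = [s * h for s, h in zip(spectrum, freq_filter)]   # pointwise product
    # Real part only: valid here because signal and kernel are both real.
    return [v.real for v in dft(filtered, inverse=True)]        # frequency -> time


def circular_conv(signal, kernel):
    """Direct circular convolution, used as a time-domain reference."""
    n = len(signal)
    return [sum(signal[(t - s) % n] * kernel[s] for s in range(n))
            for t in range(n)]


# Hypothetical trajectory of one joint coordinate over T = 8 frames.
trajectory = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0, 0.75, 0.25]
# Hypothetical full-length kernel; its DFT is the frequency-domain filter.
kernel = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]
freq_filter = dft(kernel)

filtered = temporal_filter(trajectory, freq_filter)
reference = circular_conv(trajectory, kernel)
```

In a real model the naive `dft` would be replaced by a fast Fourier transform from a numerical library, and the filter would be produced from the input features rather than fixed in advance.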

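Keywords: human action recognition; graph convolutional network; dynamic temporal filtering; Fourier transform; temporal convolutional network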

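CLC numbers: TP391 [Automation and Computer Technology: Computer Application Technology]; TP183 [Automation and Computer Technology: Computer Science and Technology]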

 
