Authors: 李松洋, 王雪婷, 陈相龙, 陈恩庆 [1] (LI Songyang; WANG Xueting; CHEN Xianglong; CHEN Enqing)
Affiliation: [1] School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001, China
Source: 《图学学报》 (Journal of Graphics), 2024, Issue 4, pp. 760-769 (10 pages)
Funding: National Natural Science Foundation of China (62101503, U1804152); Henan Province Science and Technology Research Project (222102210102); supported by the National Supercomputing Center in Zhengzhou.
Abstract: Human action recognition is one of the key research areas in computer vision, with a wide range of applications such as intelligent surveillance and human-computer interaction. Existing methods for skeleton-based action recognition often cascade graph convolutional networks (GCN) with temporal convolutional networks (TCN). However, the limited size of the TCN convolutional kernel restricts the model's global temporal modeling capability. Moreover, processing skeleton data with convolution alone leads to a lack of discriminative power among different skeleton points. Furthermore, using TCN to extract features often entails repeated calculations, so the parameter count of TCN increases as the network deepens. To address these issues, signal processing methods were utilized and a skeleton dynamic temporal filtering (SDTF) module was proposed to replace TCN for global modeling of temporal features. On this basis, lightweight improvements were made to AGCN, and the resulting AGCN-SDTF action recognition model has reduced model complexity. SDTF models temporal features through the Fourier transform: the frequency-domain features obtained from the Fourier transform are multiplied by the frequency-domain output obtained from filtering, and the product then passes through the inverse Fourier transform, thereby extracting global temporal features. Extensive experiments conducted on the large-scale NTU-RGBD and Kinetics-Skeleton datasets demonstrated that the proposed model significantly reduced network parameters and computational complexity, while achieving comparable or even superior recognition performance compared to the original model.
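The frequency-domain step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' SDTF implementation: the function name `global_temporal_filter`, the `(T, V, C)` tensor layout (frames, skeleton points, channels), and the fixed `freq_weights` array are all assumptions made for illustration. In particular, the abstract does not say how the dynamic filter is produced, so the sketch simply takes the filter as an input.

```python
import numpy as np

def global_temporal_filter(x, freq_weights):
    """Filter a skeleton sequence along time in the frequency domain.

    x            : real array of shape (T, V, C), an assumed layout of
                   frames x skeleton points x channels.
    freq_weights : complex array of shape (T // 2 + 1, V, C), a
                   hypothetical per-joint, per-channel filter.
    """
    T = x.shape[0]
    # Fourier transform along the time axis (real input, one-sided spectrum).
    X = np.fft.rfft(x, axis=0)
    # Element-wise product in the frequency domain; every output frame
    # therefore depends on every input frame (global temporal receptive field).
    Y = X * freq_weights
    # Inverse Fourier transform back to the time domain.
    return np.fft.irfft(Y, n=T, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((16, 25, 3))        # 16 frames, 25 joints, 3 channels
    identity = np.ones((16 // 2 + 1, 25, 3), dtype=complex)
    y = global_temporal_filter(x, identity)     # all-pass filter returns the input
    print(np.allclose(x, y))
```

Because the filter holds separate weights for each skeleton point, this kind of layer can treat joints differently, and a single multiplication covers the whole sequence length, which is the property the abstract contrasts with fixed-size TCN kernels.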