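Affiliations: [1] Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190 [2] University of Chinese Academy of Sciences, Beijing 100049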
Source: Journal of Image and Graphics (《中国图象图形学报》), 2021, No. 12, pp. 2879-2891 (13 pages)
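Funding: National Key Research and Development Program of China (2019YFF0301801, 2017YFB1002703); National Key Basic Research Program of China (2015CB554507); National Natural Science Foundation of China (61379082).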
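Abstract: Objective The dynamics of the human skeleton are important for action recognition. From the perspective of joint trajectories, the trajectories of the joints that are informative for determining the action class convey the most important information. Across attempts at the same action, the trajectory of a given joint generally keeps the same basic shape, but its specific form is affected by certain distortions. Based on an analysis of these distortion factors, the common transformations of joint trajectories in human motion are modeled as a spatio-temporal dual affine transformation. Method First, a unified expression describes the spatio-temporal dual affine transformation in the form of inner and outer transformations. Based on the differential relationship between the trajectory curves before and after the transformation, dual affine differential invariants are derived to describe the local properties of joint trajectories. Because the differential invariants and the joint coordinates share the same data structure, a channel augmentation method is proposed: the input data are extended along the channel dimension with the differential invariants and then fed into a neural network for training and evaluation, in order to improve the network's generalization ability. Results Experiments on two large-scale action recognition datasets, NTU (Nanyang Technological University) RGB+D (NTU 60) and NTU RGB+D 120 (NTU 120), compare the method with several state-of-the-art methods and two baseline methods; it achieves clear improvements under both experimental settings (cross-subject and cross-view recognition). Compared with spatio-temporal graph convolutional networks (ST-GCN) trained on raw data, recognition accuracy on NTU 60 improves by 1.9% and 3.0% in the cross-subject and cross-view settings, respectively; on NTU 120, it improves by 5.6% and 4.5% in the cross-subject and cross-setup settings, respectively. Compared with data augmentation, channel augmentation based on invariant features also yields clear improvements under both experimental settings and improves the generalization ability of the network more effectively. Conclusion The proposed invariant features and channel augmentation combine the advantages of traditional features and deep learning in an intuitive and effective way, improving the accuracy of skeleton-based action recognition and the generalization ability of neural networks.
Objective Skeleton-based action recognition has attracted considerable attention in recent years, because the dynamics of human skeletons carry significant information for the task. The action of a human skeleton can be seen as a time series of human poses, or as a combination of human joint trajectories. Among all the joints, the trajectories of the important joints that indicate the action class convey the most significant information. The trajectories of these joints are subject to distortions when the same action is performed in different attempts. Two similar trajectories of corresponding joints should share a basic shape, yet they appear with diverse kinds of distortion due to individual factors. These distortions are caused by spatial and temporal factors. Spatial factors include changes of viewpoint, different skeleton sizes and different action amplitudes, while temporal factors refer to time scaling along the time series, which reflects the order and speed of performing a specific action. All the spatial factors can be modeled by an affine transformation in 3D space, whereas uniform time scaling, the most commonly discussed case, can be seen as an affine transformation in 1D space. These two kinds of distortion are combined as the spatio-temporal dual affine transformation. A novel feature that is invariant under these distortions is proposed and used to facilitate skeleton-based action recognition. A feature that is invariant under the spatio-temporal affine transformation helps to identify similar trajectories and is therefore beneficial for action recognition. Method A general method for constructing the spatio-temporal dual affine differential invariant (STDADI) is proposed. Specifically, rational polynomials of the derivatives of joint trajectories are used to obtain the invariants by eliminating the transformation parameters. The resulting robust, coordinate-system-independent features are calculated directly from the 3D coordinates. Bounding t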
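The channel augmentation idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's STDADI: the abstract does not give the invariant's formula, so `example_invariant` is a hypothetical stand-in built from determinants of trajectory derivatives. It assumes the dual affine transformation takes the form r~(t) = A r(alpha*t + beta) + b, with A an invertible 3x3 matrix. The names `det3`, `time_derivatives`, `example_invariant` and `channel_augment`, the (3, T, V) tensor layout, and the finite-difference derivatives are likewise assumptions made for illustration.

```python
import numpy as np

def det3(a, b, c):
    """det[a, b, c] for stacks of 3-vectors stored along axis 0."""
    return np.einsum("i...,i...->...", a, np.cross(b, c, axis=0))

def example_invariant(d1, d2, d3, d4):
    """Hypothetical dual-affine invariant from the 1st..4th derivatives.

    Under r~(t) = A r(alpha*t + beta) + b, the determinant
    det[r^(i), r^(j), r^(k)] is multiplied by det(A) * alpha**(i+j+k).
    Numerator and denominator below both pick up det(A)**2 * alpha**14,
    so the ratio is unchanged by the transformation.
    """
    return det3(d1, d2, d3) * det3(d1, d3, d4) / det3(d1, d2, d4) ** 2

def time_derivatives(x, order=4):
    """Approximate derivatives of x (shape (3, T, V)) along the time axis
    by repeated central differences with unit frame spacing."""
    derivs, cur = [], x
    for _ in range(order):
        cur = np.gradient(cur, axis=1)
        derivs.append(cur)
    return derivs

def channel_augment(x):
    """Append the invariant as a 4th channel: (3, T, V) -> (4, T, V)."""
    d1, d2, d3, d4 = time_derivatives(x)
    with np.errstate(divide="ignore", invalid="ignore"):
        inv = example_invariant(d1, d2, d3, d4)  # shape (T, V)
    # Degenerate frames (zero denominator) are set to 0 in this sketch.
    inv = np.nan_to_num(inv, nan=0.0, posinf=0.0, neginf=0.0)
    return np.concatenate([x, inv[None]], axis=0)

# Usage: a random "skeleton" clip with 20 frames and 25 joints.
clip = np.random.default_rng(0).normal(size=(3, 20, 25))
augmented = channel_augment(clip)  # coordinates plus one invariant channel
```

Because the invariant has one value per joint per frame, it has the same layout as a coordinate channel and can be concatenated without changing the network's input structure apart from the channel count. Fourth-order finite differences amplify noise, so a practical pipeline would probably need smoothing or normalization; that choice is left open here.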
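Keywords: motion analysis; skeleton-based action recognition; spatio-temporal dual affine transformation; differential invariant; channel augmentation; generalization ability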
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]