Authors: Honghong YANG, Hongxi LIU, Yumei ZHANG, Xiaojun WU
Affiliations: [1] Key Laboratory of Modern Teaching Technology, Ministry of Education, Shaanxi Normal University, Xi'an 710062, China [2] Key Laboratory of Intelligent Computing and Service Technology for Folk Song, Ministry of Culture and Tourism, Xi'an 710062, China [3] School of Computer Science, Shaanxi Normal University, Xi'an 710062, China
Source: Chinese Journal of Electronics, 2024, Issue 6, pp. 1346-1359 (14 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61907028, 62107027, and 11872036); the Young Science and Technology Stars in Shaanxi Province (Grant No. 2021KJXX-91); and the Central Universities (Grant Nos. 2023YBGY158, K2021011004, 2022TD-26, and GK202205020).
Abstract: Graph convolutional networks that leverage spatial-temporal information from skeletal data have emerged as a popular approach for 3D human pose estimation. However, comprehensively modeling consistent spatial-temporal dependencies among the body joints remains a challenging task. Current approaches are limited by performing graph convolutions solely on immediate neighbors, deploying separate spatial or temporal modules, and relying on single-pass feedforward architectures. To address these limitations, we propose a forward multi-scale residual graph convolutional network (FMR-GNet) for 3D pose estimation from monocular video. First, we introduce a mix-hop spatial-temporal attention graph convolution layer that effectively aggregates neighboring features with learnable weights over large receptive fields; the attention mechanism dynamically computes edge weights at each layer. Second, we devise a cross-domain spatial-temporal residual module that fuses multi-scale spatial-temporal convolutional features through residual connections, explicitly modeling interdependencies across the spatial and temporal domains. Third, we integrate a forward dense connection block to propagate spatial-temporal representations across network layers, enabling high-level semantic skeleton information to enrich lower-level features. Comprehensive experiments on two challenging 3D human pose estimation benchmarks, Human3.6M and MPI-INF-3DHP, demonstrate that the proposed FMR-GNet achieves superior performance, surpassing state-of-the-art methods.
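To make the mix-hop idea from the abstract concrete, the short PyTorch-style sketch below is a minimal illustration, not the paper's implementation: the class name MixHopSpatialGraphConv, the tensor shapes, and the learnable additive edge-weight adjustment (standing in for the attention mechanism) are all assumptions made for exposition. It aggregates per-frame joint features over 1..K-hop neighborhoods of the skeleton graph with one learnable projection per hop.

    import torch
    import torch.nn as nn


    class MixHopSpatialGraphConv(nn.Module):
        # Illustrative sketch only: aggregates per-frame joint features over
        # 1..K-hop neighbourhoods of the skeleton graph, with one learnable
        # projection per hop and a learnable additive edge-weight adjustment
        # (a simplification of the attention described in the abstract).
        def __init__(self, in_dim, out_dim, adj, max_hops=3):
            super().__init__()
            self.max_hops = max_hops
            # Precompute the k-hop adjacency matrices A^1 .. A^K.
            hop, hops = torch.eye(adj.size(0)), []
            for _ in range(max_hops):
                hop = hop @ adj
                hops.append(hop)
            self.register_buffer("hops", torch.stack(hops))  # shape (K, J, J)
            self.proj = nn.ModuleList([nn.Linear(in_dim, out_dim) for _ in range(max_hops)])
            self.edge_adjust = nn.Parameter(torch.zeros(adj.size(0), adj.size(0)))

        def forward(self, x):
            # x: (batch, joints, in_dim) joint features for one frame
            out = 0.0
            for k in range(self.max_hops):
                # Row-normalized, learnably adjusted edge weights for hop k.
                weights = torch.softmax(self.hops[k] + self.edge_adjust, dim=-1)
                out = out + self.proj[k](weights @ x)
            return torch.relu(out)


    # Hypothetical usage on a 17-joint skeleton (the adjacency here is a placeholder;
    # a real skeleton graph would encode the actual bone connections).
    adj = torch.eye(17)
    layer = MixHopSpatialGraphConv(in_dim=2, out_dim=64, adj=adj, max_hops=3)
    feats = layer(torch.randn(8, 17, 2))
    print(feats.shape)  # torch.Size([8, 17, 64])

A full model in the spirit of the abstract would stack such layers with temporal convolutions, cross-domain residual connections, and forward dense connections; those pieces are omitted here because the abstract does not specify their exact form.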
Keywords: 3D human pose estimation; spatial-temporal graph convolution network; cross-domain residual connection