FMR-GNet: Forward Mix-Hop Spatial-Temporal Residual Graph Network for 3D Pose Estimation

Authors: Honghong YANG, Hongxi LIU, Yumei ZHANG, Xiaojun WU

Affiliations: [1] Key Laboratory of Modern Teaching Technology, Ministry of Education, Shaanxi Normal University, Xi'an 710062, China; [2] Key Laboratory of Intelligent Computing and Service Technology for Folk Song, Ministry of Culture and Tourism, Xi'an 710062, China; [3] School of Computer Science, Shaanxi Normal University, Xi'an 710062, China

Source: Chinese Journal of Electronics (电子学报(英文版)), 2024, Issue 6, pp. 1346-1359 (14 pages)

Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61907028, 62107027, and 11872036); the Young Science and Technology Stars in Shaanxi Province (Grant No. 2021KJXX-91); the Central Universities (Grant Nos. 2023YBGY158, K2021011004, 2022TD-26, and GK202205020).

Abstract: Graph convolutional networks that leverage spatial-temporal information from skeletal data have emerged as a popular approach for 3D human pose estimation. However, comprehensively modeling consistent spatial-temporal dependencies among the body joints remains a challenging task. Current approaches are limited by performing graph convolutions solely on immediate neighbors, deploying separate spatial or temporal modules, and utilizing single-pass feedforward architectures. To address these limitations, we propose a forward multi-scale residual graph convolutional network (FMR-GNet) for 3D pose estimation from monocular video. First, we introduce a mix-hop spatial-temporal attention graph convolution layer that effectively aggregates neighboring features with learnable weights over large receptive fields. The attention mechanism enables dynamic computation of edge weights at each layer. Second, we devise a cross-domain spatial-temporal residual module to fuse multi-scale spatial-temporal convolutional features through residual connections, explicitly modeling interdependencies across spatial and temporal domains. Third, we integrate a forward dense connection block to propagate spatial-temporal representations across network layers, enabling high-level semantic skeleton information to enrich lower-level features. Comprehensive experiments conducted on two challenging 3D human pose estimation benchmarks, namely Human3.6M and MPI-INF-3DHP, demonstrate that the proposed FMR-GNet achieves superior performance, surpassing most state-of-the-art methods.
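As a rough illustration of the mix-hop idea mentioned in the abstract (aggregating features beyond immediate neighbors), the following minimal sketch concatenates joint features propagated over 0, 1, and 2 hops of a normalized skeleton adjacency matrix. It is not the authors' layer: it omits the attention mechanism and the temporal dimension. The function names, the 5-joint chain skeleton, and the feature sizes are all assumptions made for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def mix_hop_layer(x, adj, weights):
    """Concatenate features aggregated over hops 0..K (hypothetical helper).

    x:       (num_joints, in_dim) per-joint features
    adj:     (num_joints, num_joints) binary skeleton adjacency
    weights: list of K+1 arrays of shape (in_dim, out_dim); weights[k] serves hop k
    """
    a_norm = normalize_adjacency(adj)
    outputs = []
    propagated = x  # hop 0: A^0 x
    for k, w in enumerate(weights):
        if k > 0:
            propagated = a_norm @ propagated  # hop k: A^k x
        outputs.append(np.maximum(propagated @ w, 0.0))  # linear map + ReLU
    return np.concatenate(outputs, axis=1)


# Toy skeleton (assumption): 5 joints connected in a chain.
num_joints = 5
adj = np.zeros((num_joints, num_joints))
for i in range(num_joints - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

rng = np.random.default_rng(0)
x = rng.normal(size=(num_joints, 2))                    # e.g. 2D joint coordinates
weights = [rng.normal(size=(2, 4)) for _ in range(3)]   # hops 0, 1, 2
out = mix_hop_layer(x, adj, weights)
print(out.shape)  # (5, 12): three hops, four output channels each
```

In this sketch each hop has its own weight matrix, so the layer can weight near and distant joints differently; an attention-based variant would replace the fixed normalized adjacency with edge weights computed from the features.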

Keywords: 3D human pose estimation; Spatial-temporal graph convolution network; Cross-domain residual connection

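Classification codes: TP391.41 [Automation and Computer Technology: Computer Application Technology]; TP18 [Automation and Computer Technology: Computer Science and Technology]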

 
