Explore human parsing modality for action recognition  


Authors: Jinfu Liu, Runwei Ding, Yuhang Wen, Nan Dai, Fanyang Meng, Fang-Lue Zhang, Shen Zhao, Mengyuan Liu

Affiliations: [1] School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China; [2] Peng Cheng Laboratory, Shenzhen, China; [3] State Key Laboratory of General Artificial Intelligence, Peking University, Shenzhen Graduate School, Shenzhen, China; [4] Changchun University of Science and Technology, Changchun, China; [5] Victoria University of Wellington, Wellington, New Zealand

Source: CAAI Transactions on Intelligence Technology (智能技术学报, English edition), 2024, Issue 6, pp. 1623-1633 (11 pages)

Funding: National Natural Science Foundation of China (Grant No. 62203476); Natural Science Foundation of Guangdong Province (Grant No. 2024A1515012089); Natural Science Foundation of Shenzhen (Grant No. JCYJ20230807120801002); Shenzhen Innovation in Science and Technology Foundation for The Excellent Youth Scholars (Grant No. RCYX20231211090248064).

Abstract: Multimodal action recognition methods have achieved high success using pose and RGB modalities. However, skeleton sequences lack appearance depiction, and RGB images suffer from irrelevant noise due to modality limitations. To address this, the authors introduce the human parsing feature map as a novel modality, since it can selectively retain effective semantic features of the body parts while filtering out most irrelevant noise. The authors propose a new dual-branch framework called the ensemble human parsing and pose network (EPP-Net), which is the first to leverage both skeleton and human parsing modalities for action recognition. The human pose branch feeds robust skeletons into a graph convolutional network to model pose features, while the human parsing branch leverages depictive parsing feature maps to model parsing features via convolutional backbones. The two high-level features are effectively combined through a late fusion strategy for better action recognition. Extensive experiments on the NTU RGB+D and NTU RGB+D 120 benchmarks consistently verify the effectiveness of the proposed EPP-Net, which outperforms existing action recognition methods. The code is available at https://github.com/liujf69/EPP-Net-Action.
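For a concrete picture of the dual-branch late-fusion design described in the abstract, a minimal PyTorch sketch is given below. The class names (PoseBranch, ParsingBranch, EPPNetSketch), the MLP stand-in for the skeleton GCN, the small convolutional stack for the parsing branch, and the fusion weight alpha are all illustrative assumptions, not the authors' implementation; the actual code is at the GitHub link above.

```python
# Minimal sketch of a dual-branch, late-fusion action recognizer.
# All module names and hyperparameters here are assumptions for illustration.
import torch
import torch.nn as nn


class PoseBranch(nn.Module):
    """Stand-in for the skeleton branch: a plain MLP over flattened joints
    (the paper uses a graph convolutional network)."""
    def __init__(self, num_joints=25, in_channels=3, num_frames=64, num_classes=60):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(num_frames * num_joints * in_channels, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, skeleton):            # skeleton: (N, T, V, C)
        return self.net(skeleton)


class ParsingBranch(nn.Module):
    """Stand-in for the convolutional backbone over human parsing feature maps."""
    def __init__(self, in_channels=3, num_classes=60):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, parsing_map):          # parsing_map: (N, C, H, W)
        return self.classifier(self.features(parsing_map).flatten(1))


class EPPNetSketch(nn.Module):
    """Late fusion: weighted sum of the two branches' class scores."""
    def __init__(self, num_classes=60, alpha=0.6):
        super().__init__()
        self.pose = PoseBranch(num_classes=num_classes)
        self.parsing = ParsingBranch(num_classes=num_classes)
        self.alpha = alpha                    # assumed fusion weight

    def forward(self, skeleton, parsing_map):
        return self.alpha * self.pose(skeleton) + (1 - self.alpha) * self.parsing(parsing_map)


if __name__ == "__main__":
    model = EPPNetSketch()
    skel = torch.randn(2, 64, 25, 3)          # batch of skeleton sequences
    parse = torch.randn(2, 3, 112, 112)       # batch of parsing feature maps
    print(model(skel, parse).shape)           # torch.Size([2, 60])
```

In this sketch each branch produces its own class scores and the fusion happens only at the score level, mirroring the late-fusion strategy the abstract describes.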

Keywords: action recognition; human parsing; human skeletons

Classification: TP3 [Automation and Computer Technology - Computer Science and Technology]

 
