Authors: He Yulin; Peng Shujuan[1,2]; Liu Xin; Cui Zhen[3]
Affiliations: [1] College of Computer Science and Technology, Huaqiao University, Xiamen 361021 [2] Fujian Key Laboratory of Big Data Intelligence and Security, Xiamen 361021 [3] Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Nanjing University of Science and Technology, Nanjing 210094
Source: Journal of Computer-Aided Design & Computer Graphics, 2023, No. 4, pp. 503-515 (13 pages)
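Funding: Zhejiang Lab Open Project (2021KH0AB01); Natural Science Foundation of Fujian Province (2020J01083, 2020J01084); Huaqiao University Graduate Education and Teaching Reform Project (20YJG017).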
Abstract: Owing to the complexity of video and skeleton data and the semantic gap between them, existing action matching methods cannot adequately solve the problem of associating and matching motion data across modalities. To address this, the paper proposes a cross-modal action matching method for RGB video and 3D skeleton data. First, a cross-modal action matching framework is designed to mine the common semantic information shared by RGB video data and skeleton sequence data. Second, a weight-sharing multimodal dual-layer residual structure and a bidirectional hybrid constraint are introduced to mine inter-modal associations and generate cross-modal representations in a shared semantic embedding, which improves data utilization and model performance. Finally, an elastic verification module is proposed that drives the network to focus on learning discriminative action features in the shared semantic space, improving the model's generalization. Experimental results show that the framework handles action matching between the RGB video and skeleton sequence modalities more effectively, and that it outperforms three existing baseline algorithms on the NTU-RGBD and JHMDB datasets in terms of cross-modal ACC and MAP.
Keywords: cross-modal action matching; dual-layer residual structure; bidirectional hybrid constraint; elastic verification
Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]