Authors: MA Yatong; WANG Song[1,2]; LIU Yingfang
Affiliations: [1] School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; [2] Gansu Provincial Engineering Research Center for Artificial Intelligence and Graphic and Imaging Processing, Lanzhou 730070, China
Source: Computer Engineering, 2022, No. 9, pp. 180-188 (9 pages)
Funding: National Natural Science Foundation of China (62067006); Natural Science Foundation of Gansu Province (21JR7RA291); Gansu Provincial Education Science and Technology Innovation Project (2021jyjbgs-05); Gansu Provincial Higher Education Industry Support Program (2020C-19).
Abstract: Human action recognition based on multimodal fusion has been widely studied and applied. In existing approaches, feature-level or decision-level fusion is performed at a single level or stage, so the true semantic information in the data cannot be mapped to the classifier. This paper proposes a multilevel multimodal fusion method for human action recognition that is better suited to practical application scenarios. At the input end, depth data are converted into Depth Motion Maps (DMM) and inertial data into signal images; each input modality is then further expanded into multiple modalities by processing the depth motion maps and signal images with Local Ternary Patterns (LTP). All modalities are fed to convolutional neural networks for feature extraction, and the extracted features are fused at the feature level via Discriminant Correlation Analysis (DCA), which maximizes the correlation of corresponding features across the two feature sets while eliminating the correlation between features of different classes within each set. The fused features are finally used as the input to a multiclass support vector machine for action recognition. Experimental results on the UTD-MHAD and UTD Kinect V2 MHAD multimodal datasets show that the proposed multilevel multimodal fusion framework achieves recognition accuracies of 99.8% and 99.9%, respectively.
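The fusion stage described in the abstract can be made concrete with a short sketch. The Python code below is a minimal illustration of only the last two steps: a DCA-style feature-level fusion of two feature sets, following the general formulation of Haghighat et al. (2016), followed by a multiclass SVM. It assumes feature matrices that would normally come from the trained CNNs (here replaced by random toy vectors); the DMM, signal-image, and LTP preprocessing and the CNN training are not reproduced, and the names dca_transforms and whiten_between_class are illustrative, not taken from the paper's implementation.

import numpy as np
from sklearn.svm import SVC

def dca_transforms(X, Y, labels, eps=1e-10):
    # Discriminant Correlation Analysis (DCA) sketch. X: (n, p) and Y: (n, q) are
    # feature sets for the same n samples; labels: (n,) class labels.
    classes = np.unique(labels)

    def whiten_between_class(F):
        # Between-class scatter S_b = Phi @ Phi.T, where column k of Phi is
        # sqrt(n_k) * (mean of class k - overall mean). Returns W with W.T @ S_b @ W = I,
        # solved via the small c x c problem Phi.T @ Phi instead of the d x d scatter.
        mu = F.mean(axis=0)
        Phi = np.stack([np.sqrt(np.sum(labels == k)) * (F[labels == k].mean(axis=0) - mu)
                        for k in classes], axis=1)
        evals, Q = np.linalg.eigh(Phi.T @ Phi)
        keep = evals > eps
        return Phi @ Q[:, keep] / evals[keep]

    # Project each feature set into its class-discriminant subspace.
    Xp, Yp = X @ whiten_between_class(X), Y @ whiten_between_class(Y)
    # Align the two sets so that corresponding features are maximally correlated
    # and non-corresponding features are uncorrelated.
    U, s, Vt = np.linalg.svd(Xp.T @ Yp, full_matrices=False)
    keep = s > eps
    U, s, V = U[:, keep], s[keep], Vt[keep].T
    return Xp @ (U / np.sqrt(s)), Yp @ (V / np.sqrt(s))

# Toy demonstration: random vectors stand in for the CNN features of the two modalities.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(27), 20)                 # e.g. 27 action classes, 20 samples each
X = rng.normal(size=(labels.size, 512)) + 0.05 * labels[:, None]   # "depth" features
Y = rng.normal(size=(labels.size, 256)) + 0.05 * labels[:, None]   # "inertial" features
Xs, Ys = dca_transforms(X, Y, labels)
fused = np.hstack([Xs, Ys])                           # feature-level fusion by concatenation
clf = SVC(kernel='linear').fit(fused, labels)
print('training accuracy:', clf.score(fused, labels))

Because DCA projects each set into a subspace whose dimensionality is bounded by the number of classes, the fused vector is much smaller than the raw CNN features, which keeps the downstream multiclass SVM inexpensive to train.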
Keywords: human action recognition; depth motion maps; inertial sensor; local ternary pattern; discriminant correlation analysis
CLC number: TP391 [Automation and Computer Technology - Computer Application Technology]