Authors: 陈朋 (CHEN Peng); 王顺 (WANG Shun); 党源杰 (DANG Yuan-jie); 宦若虹 (HUAN Ruo-hong) [1] (College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China; College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China)
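Affiliations: [1] College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China; [2] College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China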
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2022, No. 8, pp. 1739-1745 (7 pages)
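Funding: Supported by the National Natural Science Foundation of China (U1909203), the Zhejiang Provincial Natural Science Foundation (LY19F020032), and the Fundamental Research Funds for the Provincial Universities of Zhejiang (RF-C2019001).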
Abstract: In video understanding, to reduce the cost of data annotation for action detection while improving detection accuracy, this paper proposes a weakly supervised video action detection method based on skeleton data, which trains the action detection network using only video-level category labels. The network takes two-dimensional human skeleton data and RGB image data as input. A recurrent neural network extracts temporal information from the skeleton data and passes it to a fully connected layer, which outputs the required features. The features extracted from the skeleton data and those extracted from the RGB data are each fed into an attention network to generate corresponding weights, which are used to produce weighted features and weighted temporal class activation map values. Actions are then classified and temporally localized from the weighted features and the weighted temporal class activation map values. Experimental results show that the proposed algorithm, which incorporates human skeleton data, uses less temporal annotation of the data than supervised algorithms, and that it improves detection accuracy on the THUMOS14 and ActivityNet1.3 datasets.
Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]
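Illustrative sketch: the abstract describes scaling per-segment class scores by attention weights to obtain weighted temporal class activation map values, then using those values for temporal localization. The Python below is only a rough illustration of that idea, not the authors' implementation. The function names, the element-wise averaging of the skeleton and RGB streams, the fixed threshold of 0.5, and all numbers are assumptions made for this example; the abstract does not specify any of them.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # rows = video segments, columns = action classes


def weighted_tcam(attention: List[float], class_scores: Matrix) -> Matrix:
    """Scale each segment's per-class scores by that segment's attention weight."""
    return [[a * s for s in row] for a, row in zip(attention, class_scores)]


def fuse(tcam_a: Matrix, tcam_b: Matrix) -> Matrix:
    """Average two streams element-wise (an assumed fusion rule)."""
    return [[(x + y) / 2 for x, y in zip(ra, rb)] for ra, rb in zip(tcam_a, tcam_b)]


def localize(tcam: Matrix, class_idx: int, threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return [start, end) segment index ranges where the class value exceeds threshold."""
    segments = []
    start = None
    for t, row in enumerate(tcam):
        active = row[class_idx] > threshold
        if active and start is None:
            start = t  # a new detected interval begins
        elif not active and start is not None:
            segments.append((start, t))  # the interval ended at the previous segment
            start = None
    if start is not None:
        segments.append((start, len(tcam)))  # interval runs to the end of the video
    return segments


# Made-up attention weights and class scores for 5 segments and 1 class.
skeleton = weighted_tcam([0.1, 0.9, 0.9, 0.2, 0.8], [[0.9], [0.9], [0.8], [0.9], [0.9]])
rgb = weighted_tcam([0.2, 0.8, 1.0, 0.1, 0.9], [[0.8], [0.9], [0.9], [0.7], [0.8]])
detected = localize(fuse(skeleton, rgb), class_idx=0)
print(detected)
```

In this toy input, segments 0 and 3 have high raw class scores but low attention, so the weighting suppresses them and they fall outside the detected intervals. That suppression is the role the abstract assigns to the attention weights.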