Algorithm for Video Temporal Action Proposal Combining Watershed and Regression Networks (cited by: 1)

Authors: Huang Yunwen; Wang Fei; Li Jinghong[1]; Wang Guorui

Affiliations: [1] College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning 110819, China; [2] Faculty of Robot Science and Engineering, Northeastern University, Shenyang, Liaoning 110169, China

Source: Chinese Journal of Lasers (《中国激光》), 2019, Issue 11, pp. 270-278 (9 pages)

Abstract: A two-stage action proposal network is designed for the temporal action proposal task. The first stage applies a modified watershed algorithm to a one-dimensional temporal signal, producing candidate regions of various lengths through immersion (flooding) clustering and thereby coarsely localizing the temporal boundaries of actions. A temporal pyramid structuring method is then proposed, together with a context module for action segments, to model the body and the context of each candidate region and generate an enhanced global feature. The second stage uses a temporal-coordinate regression algorithm to localize the action boundaries and adds an action/background classifier to filter out background candidates, yielding more precise temporal boundaries. The entire network is trained on unit-level features extracted by a three-dimensional convolutional neural network (C3D), which capture rich temporal and spatial semantics of the video and considerably improve training efficiency while improving accuracy. Experiments on two benchmark datasets, Thumos 14 and ActivityNet, show that the proposed two-stage algorithm achieves the highest average recall among the compared methods and can effectively improve the precision of action localization.
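The first-stage idea of flooding a one-dimensional signal to obtain candidate regions of different lengths can be illustrated with a minimal sketch. It is not the authors' modified watershed algorithm, whose details the abstract does not give. It assumes a per-unit actionness score in [0, 1] and a hand-picked set of water levels; the names `flood_regions`, `watershed_proposals`, `scores`, and `levels` are hypothetical.

```python
from typing import List, Sequence, Tuple

Region = Tuple[int, int]  # half-open interval [start, end) in unit indices


def flood_regions(scores: Sequence[float], level: float) -> List[Region]:
    """Return maximal runs of consecutive units whose score is >= level."""
    regions: List[Region] = []
    start = None
    for i, s in enumerate(scores):
        if s >= level and start is None:
            start = i                      # a run begins
        elif s < level and start is not None:
            regions.append((start, i))     # the run ended at unit i - 1
            start = None
    if start is not None:                  # a run reaches the end of the signal
        regions.append((start, len(scores)))
    return regions


def watershed_proposals(scores: Sequence[float],
                        levels: Sequence[float]) -> List[Region]:
    """Collect the regions found at every water level, without duplicates."""
    seen = set()
    for level in levels:
        seen.update(flood_regions(scores, level))
    return sorted(seen)


# Assumed toy actionness signal: two peaks separated by a shallow dip.
scores = [0.1, 0.2, 0.8, 0.9, 0.6, 0.3, 0.7, 0.9, 0.2, 0.1]
proposals = watershed_proposals(scores, levels=[0.5, 0.25, 0.85])
print(proposals)
```

A lower level merges the two peaks into one long region, while a higher level yields short regions around each peak, which is how a single score curve gives rise to candidates of several lengths. In a two-stage design such as the one described, these coarse regions would then be passed to a boundary regressor and an action/background classifier.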

Keywords: machine vision; video temporal detection; action localization; pyramid pooling; temporal context

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
