检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:黄韵文 王斐 李景宏[1] 王国锐 Huang Yunwen;Wang Fei;Li Jinghong;Wang Guorui(College of Information Science and Engineering,Northeastern University,Shenyang,Liaoning 110819,China;Faculty of Robot Science and Engineering,Northeastern University,Shenyang,Liaoning 110169,China)
机构地区:[1]东北大学信息科学与工程学院,辽宁沈阳110819 [2]东北大学机器人科学与工程学院,辽宁沈阳110169
出 处:《中国激光》2019年第11期270-278,共9页Chinese Journal of Lasers
摘 要:针对时序动作选举任务,设计一种两段式动作候选区域选举网络。第一段将改进的分水岭算法应用于一维时序信号,通过浸水聚类产生多种不同长度的候选区域,实现动作时序边界的粗定位,进而提出一种时序金字塔结构化方法,引入动作片段的上下文信息模块,对候选区域的主体信息和上下文信息进行结构化建模,生成一个增强的全局特征。第二段利用时序坐标回归算法定位动作边界,同时加入动作/背景分类器过滤背景候选区域,得到更加精确的时序边界。整个网络以三维卷积神经网络(C3D)提取的单元级特征进行训练,挖掘了视频时域和空域的丰富语义,在提升算法精度的同时大大提升了训练效率。在两大基准数据集Thumos 14和ActivityNet上进行测试,结果表明,与已有方法相比,两段式视频时序动作选举算法达到了最优平均召回率,可有效提高动作定位的精度。A two-stage action-candidate regional proposal network is designed herein for a temporal action detection task. The first stage applies a modified watershed algorithm to an one-dimensional temporal signal to form candidate regions with different lengths by immersion clustering, which obtains a rough localization of action temporal boundary. Then, a temporal pyramid structural method is introduced to model the structure of action instances and their contextual information, generating an enhanced global feature. The second stage performs a temporal-coordinate regression algorithm to local the action boundary, and simultaneously a classifier for the action and boundary is added to filter the candidate regions of background for obtaining a more accurate temporal boundary. Furthermore, an unit-level feature extracted by a three-dimensional convolution neural network(C3 D) is used to train the entire two-stage proposal algorithm, which contains both spatial and temporal information and considerably improves training efficiency while improving the accuracy of the algorithm. Experiments on two large-scale benchmark datasets, Thumos 14 and ActivityNet, show that the proposed approach achieves the optimal average recall rate over other state-of-the-art methods, indicating that this method can efficiently improve the precision of an action localization task.
关 键 词:机器视觉 视频时序检测 动作定位 金字塔池化 时序上下文
分 类 号:TP391[自动化与计算机技术—计算机应用技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:3.145.45.170