Head Detection Method Based on Optimized Deformable Regional Fully Convolutional Neural Networks  Cited by: 6


Authors: Ji Xunsheng (吉训生) [1]; Wang Hao (王昊) (School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China)

Affiliation: [1] School of Internet of Things Engineering, Jiangnan University

Source: Laser & Optoelectronics Progress (《激光与光电子学进展》), 2019, No. 14, pp. 121-131 (11 pages)

Funding: National Natural Science Foundation of China (61771223); Key Research and Development Program of Jiangsu Province (SBE2018334)

Abstract: Head detection is an important research topic in people counting, and detection-based counting methods are widely used in video surveillance. Head detection is often affected by occlusion, background interference, and illumination. To address these problems, a head detection method based on a regional fully convolutional neural network is proposed. In the feature learning stage, features and regions of interest are obtained with a residual network (ResNet) and a region proposal network, and deformable convolution layers are added to the residual network. The regions of interest are then fed into the pooling layer for deformable position-sensitive mean pooling. Finally, classification and refinement of the target location are performed, and a position-sensitive region-of-interest alignment step followed by pooling is proposed. To improve detection of heads at multiple scales, the anchor generation rules of the region proposal network are updated. An online hard example mining algorithm improves the detection of head targets in complex tasks, and soft non-maximum suppression reduces the mutual interference between detected bounding boxes. The results show that the average recognition accuracy on the HollywoodHeads dataset reaches up to 83.24%, which is better than that of the methods in the related literature.
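As an illustration of the soft non-maximum suppression step mentioned in the abstract, the following is a minimal sketch of the commonly used Gaussian variant. It is not the authors' implementation: the abstract does not state which decay function or parameters were used, so the Gaussian decay, the names `iou` and `soft_nms`, and the values of `sigma` and `score_thresh` are all assumptions made for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (illustrative; sigma and score_thresh are assumed values).

    Returns a list of (box_index, rescored_score) in selection order.
    """
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))

        # Decay the scores of the other boxes according to their overlap
        # with the selected box, instead of discarding them outright.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score = score * math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                decayed.append((idx, score))
        remaining = decayed
    return kept


if __name__ == "__main__":
    # Two heavily overlapping boxes and one isolated box (made-up values).
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    for idx, score in soft_nms(boxes, scores):
        print(idx, round(score, 3))
```

In this sketch the second box, which overlaps the top-scoring one, keeps a reduced score rather than being removed, so it is ranked below the isolated box. Hard NMS would instead delete it once its overlap exceeded a fixed threshold, which is the behaviour that tends to suppress partially occluded neighbouring heads.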

Keywords: image processing; regional fully convolutional neural network; head detection; deformable convolution

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
