A multivariate information aggregation method for crowd density estimation and counting  Cited by: 2


Authors: LIU Guanghui [1], WANG Qinmeng, CHEN Xuanrun, MENG Yuebo [1]

Affiliations: [1] School of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, Shaanxi, China; [2] Zhongke Xingtu Spatial Data Technology Co., Ltd., Xi'an 710199, Shaanxi, China

Source: Optics and Precision Engineering (《光学精密工程》), 2022, No. 10, pp. 1228-1239 (12 pages)

Funding: Natural Science Basic Research Program, General Project (No. 2020JM-473, No. 2020JM-472); Shaanxi Province Key Research and Development Program (No. 2021SF-429).

Abstract: Crowd density estimation and counting measures the distribution and number of people in crowded scenes, and is important for applications such as security systems and traffic control. To address difficult feature extraction, hard-to-obtain spatial and semantic information, and insufficient feature fusion in high-density images, this paper proposes a multivariate information aggregation (MIA) method for crowd density estimation. First, a multivariate information extraction network is designed: VGG-19 serves as the backbone to deepen feature extraction, a multilayer semantic supervision strategy encodes low-level features to strengthen their semantic representation, and spatial information embedding enriches the spatial representation of high-level features. Second, a multiscale context information aggregation network is designed, in which two lightweight atrous spatial pyramid pooling structures with strided convolution (simplify-atrous spatial pyramid pooling, S-ASPP) aggregate global multiscale context while reducing parameter redundancy. Finally, strided convolution is applied at the end of the network to speed up inference without affecting accuracy. Comparative experiments are conducted on the ShanghaiTech, UCF-QNRF, and NWPU datasets. On ShanghaiTech, the MAE and MSE are 59.4 and 96.2 on Part_A and 7.7 and 11.9 on Part_B; on UCF-QNRF, an ultra-high-density multi-view dataset, the MAE is 89.3 and the MSE is 164.5; on NWPU, the MAE is 87.9 and the MSE is 417.2. The proposed method improves on the compared methods to some extent, and results in real-world scenes confirm that it performs well.
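The abstract reports MAE and MSE without defining them. As a minimal sketch, assuming the convention common in crowd-counting work (the count for an image is the sum of its predicted density map, MAE is the mean absolute count error, and "MSE" denotes the root of the mean squared count error), the metrics could be computed as below. The function names and the toy numbers are illustrative assumptions and are not taken from the paper.

```python
import math
from typing import Sequence


def count_from_density(density_map: Sequence[Sequence[float]]) -> float:
    """Estimated head count: the sum of all values in a 2-D density map."""
    return sum(sum(row) for row in density_map)


def mae(pred: Sequence[float], true: Sequence[float]) -> float:
    """Mean absolute error between predicted and ground-truth counts."""
    if len(pred) != len(true) or not true:
        raise ValueError("pred and true must be non-empty and equally long")
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def mse(pred: Sequence[float], true: Sequence[float]) -> float:
    """'MSE' under the assumed crowd-counting convention:
    the square root of the mean squared count error."""
    if len(pred) != len(true) or not true:
        raise ValueError("pred and true must be non-empty and equally long")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


# Toy example (made-up values, for illustration only).
toy_density = [[0.0, 0.5],
               [1.0, 0.5]]
toy_count = count_from_density(toy_density)

predicted_counts = [10.0, 20.0]
true_counts = [12.0, 17.0]
toy_mae = mae(predicted_counts, true_counts)
toy_mse = mse(predicted_counts, true_counts)
```

Because the squared term weights large per-image errors more heavily, the second metric is the more sensitive of the two to a few badly miscounted images.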

Keywords: crowd density estimation; semantic supervision; spatial information embedding; information aggregation; strided convolution

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
