Scene classification algorithm fusing visual perception characteristics  Cited by: 4

Scene classification algorithm of fusing visual perception


Authors: Shi Jing (史静)[1], Zhu Hong (朱虹)[1], Wang Dong (王栋)[1], Du Sen (杜森)[1]

Affiliation: [1] School of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048

Source: Journal of Image and Graphics (《中国图象图形学报》), 2017, No. 12, pp. 1750-1768 (19 pages)

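Funding: National Natural Science Foundation of China (61502385; 61673318); Xi'an Science and Technology Plan Project (CXY1509(13)); Xi'an University of Technology Key Teaching Research Project (xjy1670)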

Abstract: Objective Scene classification remains difficult because of the diversity and complexity of the internal structure of scenes and the effects of illumination and shooting angle. Most existing algorithms build their models purely from extracted features and do not consider the relationships among the objects in a scene image, so they still fall short of ideal classification performance. To address these key difficulties, this paper takes the visual perception characteristics of the human eye into account, applies saliency detection, and combines it with the traditional bag-of-visual-words model to propose a scene classification algorithm that fuses visual perception characteristics. Method First, the image is decomposed at multiple scales and image features are extracted at each scale. Next, the visually salient regions of the image are detected at each scale. Finally, the salient-region information is fused with the multi-scale features to form a multi-scale fused, window-selected, weighted SIFT feature (WSSIFT), which is used to classify the scene. Results To verify its effectiveness, the algorithm was tested on three standard datasets (SE, LS, and IS) and compared with different methods; classification accuracy improved by about 3% to 17%. Conclusion The proposed algorithm effectively alleviates the limitations of purely feature-based description and improves the overall representation of the image. The experimental results show that the algorithm classifies well on multiple datasets and is suitable for machine vision tasks such as scene analysis, understanding, and classification.

Objective Scene classification is an important part of machine vision. The content of a scene is identified by analyzing the objects in the scene and their relative positions. In recent years, the surge in the number of images has introduced great challenges in image recognition, retrieval, and classification. Accurately obtaining the information users need from vast amounts of data is becoming increasingly urgent in this field. Early image recognition technology focused mainly on describing the low-level information of images. The bag-of-words model originated in document processing: it first transforms a document into a combination of keywords and then performs matching on the basis of keyword frequency. In recent years, computer vision researchers have successfully applied this method to image processing. In the bag-of-words model, the image plays the role of the document. Visual words are generated by image feature extraction, and the bag-of-words representation of the image is built from the frequency of those visual words. At present, an ideal classification result is difficult to achieve because of the diversity and complexity of the internal structure of scenes. Physiological and psychological research has shown that the human visual system pays more attention to salient regions than to salient points, and these regions are referred to as saliency regions. The visual attention model is a major new research topic. Saliency analysis uses a computational method to find the region of the image that attracts the most interest and carries the most content, and represents it as a saliency map. In this study, a scene classification algorithm based on visual perception is proposed to address the key and difficult problems in scene classification. Specifically, the visual perception characteristics of the human eye are considered, and saliency detection is combined with the traditional bag-of-visual-words model. Method On the basis of visual significance and phonetic mo

Keywords: visual perception; scene classification; multi-scale; feature fusion; WSSIFT feature

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
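
The abstract outlines a pipeline of per-scale local features, saliency detection, and fusion into a bag-of-visual-words representation. The following is only a minimal sketch of that general idea, not the authors' WSSIFT construction, whose window selection and weighting details are not given in this record. It assumes that each local descriptor carries a scalar saliency weight, that each descriptor votes for its nearest visual word with that weight, and that the per-scale histograms are simply concatenated. The function names (nearest_word, weighted_bow_histogram, fuse_scales) and the toy 2-D descriptors are hypothetical.

```python
# Hypothetical sketch: saliency-weighted bag-of-visual-words histogram.
# This is NOT the paper's WSSIFT implementation; names and the weighting
# rule are illustrative assumptions.
from math import dist


def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))


def weighted_bow_histogram(descriptors, saliency, codebook):
    """Build a normalized histogram in which each local descriptor votes
    for its nearest visual word with a weight equal to its saliency."""
    hist = [0.0] * len(codebook)
    for desc, weight in zip(descriptors, saliency):
        hist[nearest_word(desc, codebook)] += weight
    total = sum(hist)
    # Leave an all-zero histogram untouched to avoid dividing by zero.
    return [h / total for h in hist] if total > 0 else hist


def fuse_scales(per_scale_histograms):
    """Concatenate the per-scale histograms into one image representation."""
    return [value for hist in per_scale_histograms for value in hist]


# Toy data: 2-D "descriptors" stand in for 128-D SIFT vectors.
codebook = [(0.0, 0.0), (1.0, 1.0)]
scale1 = weighted_bow_histogram([(0.1, 0.0), (0.9, 1.0)], [0.25, 0.75], codebook)
scale2 = weighted_bow_histogram([(1.0, 0.9)], [1.0], codebook)
feature = fuse_scales([scale1, scale2])
print(feature)  # [0.25, 0.75, 0.0, 1.0]
```

In a real system the descriptors would come from a SIFT extractor, the codebook from clustering training descriptors, and the fused vector would be passed to a classifier; those stages are omitted here.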

 
