Multi-level Semantic Binary Descriptor for Image Retrieval  (cited by: 5)


Authors: WU Ze-Bin; YU Jun-Qing [1,2]; HE Yun-Feng; GUAN Tao [1]

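Affiliations: [1] Department of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074; [2] Center of Network and Computation, Huazhong University of Science and Technology, Wuhan 430074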

Source: Chinese Journal of Computers, 2020, No. 9, pp. 1641-1655 (15 pages)

Funding: Supported by the National Natural Science Foundation of China (61572211, 61173114, 61202300).

Abstract: With the explosive growth of image data, content-based image retrieval has attracted considerable attention. The performance of an image retrieval system is determined largely by its descriptor. Many traditional descriptors have been proposed, but their retrieval accuracy is unsatisfactory. With the development of deep learning, using convolutional neural networks (CNNs) to learn compact and discriminative image representations has become increasingly common. Features from the fully-connected layers of a CNN are usually designed for classification: they tend to capture high-level semantic information, fail to capture local image information adequately, and are high-dimensional. To address the lack of local information and the high dimensionality of fully-connected-layer features, this paper proposes a Multi-level Semantic Binary Descriptor (MSBD). MSBD is generated in two steps: construction of a multi-level semantic real-valued descriptor, and binary descriptor learning. The real-valued descriptor consists of a global branch, an object branch, and a salient-region branch; each branch represents one semantic level, so the descriptor captures global features and salient local features simultaneously. The binary descriptor learning algorithm uses an iterative process to reduce both the quantization error introduced by binarization and the redundancy in the codes, so that the descriptor is compressed with little loss of discriminative power. To further improve query accuracy, the paper proposes a dissimilarity measure that combines the visual semantic information represented by the hash codes with class-level, high-level conceptual semantic information. Systematic comparative experiments on typical datasets in this field show that MSBD is highly discriminative: its query accuracy exceeds that of many state-of-the-art real-valued descriptors, it reaches accuracy close to the best existing method on Oxford5K, and it outperforms existing methods by about 4.3% on Paris6K and by about 2.1% on Holidays.

English abstract: With the explosive growth of multimedia data on the Internet, finding an image that meets a user's query is becoming increasingly difficult, and content-based image retrieval, which aims to find database images similar to a query image supplied by the user, is attracting increasing attention. The performance of an image retrieval system is largely determined by the image descriptor used. Many traditional shallow descriptor-building frameworks have been proposed; however, the accuracy they achieve on image retrieval benchmark datasets is unsatisfactory because of the limited representation ability of shallow descriptors. With the advent of deep learning, using convolutional neural networks to learn compact and discriminative representations has attracted considerable interest, because the learning ability of these networks is very strong given enough training data and supervision. Many methods use fully-connected layer features to generate the representation for image retrieval, because these features are relatively informative compared with those of earlier layers. However, convolutional neural networks are usually trained for classification, and features from their fully-connected layers usually capture high-level semantic information and lack sufficient local characteristics of the input image, which weakens the discriminative ability of the image descriptor. Moreover, features from fully-connected layers are usually not compact and consume a lot of storage, which limits scalability. To address these problems, we propose a multi-level semantic binary descriptor building method that captures global and salient local features simultaneously. Instead of the popular end-to-end approach, our method is composed of two stages: multi-level semantic real-valued descriptor building and binary code learning. The multi-level semantic real-valued descr
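The abstract names two ideas without giving their formulas: binarizing a real-valued descriptor while keeping quantization error low, and ranking results with a dissimilarity that mixes hash-level visual information with class-level information. The sketch below only illustrates those two ideas in their simplest form. It is not the paper's method: plain sign thresholding stands in for the iterative binary learning algorithm, and the functions `binarize`, `quantization_error`, `dissimilarity`, and `rank`, as well as the `class_weight` parameter and the additive class penalty, are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def binarize(x: Sequence[float]) -> List[int]:
    """Sign-threshold a real-valued descriptor: bit is 1 where the value is > 0.

    Assumption: a stand-in for the paper's learned binarization.
    """
    return [1 if v > 0 else 0 for v in x]


def quantization_error(x: Sequence[float]) -> float:
    """Squared distance between x and its {-1, +1} sign code."""
    return sum((v - (1.0 if v > 0 else -1.0)) ** 2 for v in x)


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of bit positions where two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(p != q for p, q in zip(a, b))


def dissimilarity(code_a: Sequence[int], code_b: Sequence[int],
                  class_a: str, class_b: str,
                  class_weight: float = 0.5) -> float:
    """Normalized Hamming distance plus a penalty when class labels differ.

    Assumption: the additive form and class_weight are hypothetical; the
    abstract only says the measure uses both kinds of information.
    """
    visual = hamming(code_a, code_b) / len(code_a)
    semantic = 0.0 if class_a == class_b else 1.0
    return visual + class_weight * semantic


def rank(query: Tuple[Sequence[int], str],
         database: Sequence[Tuple[str, Sequence[int], str]],
         class_weight: float = 0.5) -> List[str]:
    """Return database item names ordered from most to least similar."""
    q_code, q_class = query
    scored = [(dissimilarity(q_code, code, q_class, cls, class_weight), name)
              for name, code, cls in database]
    return [name for _, name in sorted(scored)]
```

With `class_weight` set to 0 the ranking reduces to ordinary Hamming ranking, which makes it easy to see how much the class-level term changes the order of results in this toy setting.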

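Keywords: image representation; convolutional neural network; dissimilarity measure; image retrieval; multi-level semantic binary descriptor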

Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
