Feature selection method for image retrieval based on connected graphs and bag of words

Cited by: 2

Authors: Li Guoxiang; Wang Jijun [2,3]; Ma Wenbin [1,2]

Affiliations: [1] Department of Academic Affairs, Guangxi University of Finance and Economics, Nanning 530003, China; [2] Guangxi Key Laboratory of Multi-Source Information Mining & Security, Guangxi Normal University, Guilin 541004, China; [3] Department of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003, China

Source: Journal of Image and Graphics (《中国图象图形学报》), 2021, No. 10, pp. 2533-2544 (12 pages)

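Funding: National Natural Science Foundation of China (71862003); Guangxi Key Research and Development Program (2018AB15003); Open Fund of the Guangxi Key Laboratory of Multi-Source Information Mining and Security (MIMS17-02); Basic Ability Improvement Project for Young and Middle-aged Teachers in Guangxi Universities (2021KY0650, 2019KY0661); special funding from the Cultivation Base of the Guangxi Key Laboratory of Cross-border E-commerce Intelligent Information Processing (Guangxi University of Finance and Economics).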

Abstract:

Objective: As the features used for image retrieval become more refined, retrieval accuracy improves, but a large number of irrelevant and redundant features is inevitably produced as well. This raises memory and computation requirements, especially in large-scale image retrieval and classification, so feature selection plays a critical role. Starting from the simple idea of reducing the number of features, we propose an effective connected-graph feature point selection method and explore the tradeoff between image retrieval accuracy and feature selection.

Method: Building on the bag-of-words (BOW) image retrieval mechanism, we combine attributes such as the nearest-neighbor word cross kernel, feature distance, and feature scale to construct a pixel-level feature separation graph that consists of several connected branches and trivial graphs. We compute the cross kernel among the first D nearest neighbor words of each feature point; if the crossing set is empty and the distance and scale between two feature points satisfy the established conditions, the two points are assigned to the same group. The inverse document frequency of the subgraph feature points is used to correct the edge weights, weighting the first D nearest neighbor words to measure their contribution. Features are then selected on two criteria: the number of nodes in each connected component, and the nearest-neighbor word correlation of isolated points. The problem thus becomes minimizing the order of the feature separation graph while preserving image matching accuracy. An isolated point is retained as a valid feature point if its maximum cross kernel with other points is greater than the threshold n; the points of a connected branch are retained as valid feature points if the connected component is smaller than the preset threshold γ.

Results: Experiments use the public Oxford and Paris datasets and evaluate feature storage requirements, time complexity, and retrieval accuracy, with comparisons against other feature extraction and selection methods. After selection, the number of features and the storage footprint are reduced by more than 50%. KD-Tree query time with a 100k-word dictionary falls by nearly 58%. On the Oxford dataset, retrieval accuracy improves by nearly 7.5% on average relative to other encoding methods and fully connected layer features. On the Paris dataset, retrieval accuracy is on average 4% higher than other encoding methods, but lower than that of fully connected layer features. Extensive experiments demonstrate the redundancy of large connected regions and the selectability of isolated points.

Conclusion: By constructing the feature separation graph, discarding the redundant feature points in large connected regions, and retaining the isolated feature points that show nearest-neighbor word correlation, the method produces a compact feature point set for each image. Overall retrieval performance is stable: accuracy is roughly on par with the original feature point set, and for some categories it is better than both the original features and other methods. The selected features are also highly reusable, which makes further aggregation and integration convenient.
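The selection step described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the edges of the feature separation graph are assumed to be precomputed (the cross-kernel, distance, and scale tests are not reproduced here), the cross kernel is assumed to be an IDF-weighted overlap of nearest-word sets, and the names `connected_components`, `cross_kernel`, `select_features`, `gamma`, and `word_thresh` are hypothetical.

```python
from collections import defaultdict


def connected_components(n_points, edges):
    """Group point indices 0..n_points-1 into components with union-find."""
    parent = list(range(n_points))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    groups = defaultdict(list)
    for i in range(n_points):
        groups[find(i)].append(i)
    return list(groups.values())


def cross_kernel(words_a, words_b, idf):
    """Assumed form: sum of IDF weights over the shared nearest words."""
    return sum(idf[w] for w in set(words_a) & set(words_b))


def select_features(nearest_words, edges, idf, gamma, word_thresh):
    """Return sorted indices of retained feature points.

    nearest_words[i] holds the D nearest visual-word ids of point i.
    Isolated points are kept when their best cross kernel with any other
    point exceeds word_thresh; components with at least two nodes are
    kept only when they have fewer than gamma nodes.
    """
    n = len(nearest_words)
    kept = []
    for comp in connected_components(n, edges):
        if len(comp) == 1:
            i = comp[0]
            best = max(
                (cross_kernel(nearest_words[i], nearest_words[j], idf)
                 for j in range(n) if j != i),
                default=0.0,
            )
            if best > word_thresh:
                kept.append(i)
        elif len(comp) < gamma:
            kept.extend(comp)
    return sorted(kept)


# Toy data: points 0-2 form one component; points 3, 4, 5 are isolated.
toy_words = [[1, 2], [2, 3], [3, 4], [7, 8], [8, 9], [20, 21]]
toy_edges = [(0, 1), (1, 2)]
toy_idf = {w: 1.0 for w in range(30)}
selected = select_features(toy_words, toy_edges, toy_idf, gamma=3, word_thresh=0.5)
```

With `gamma=3` the three-node component is discarded as redundant, points 3 and 4 survive because they share word 8, and point 5 is dropped because it shares no word with any other point. Raising `gamma` to 4 would keep the three-node component as well.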

Keywords: bag of words (BOW); feature selection; image retrieval; connected component; aggregated features

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
