Authors: 艾列富, 陶勇, 蒋常玉 (AI Liefu; TAO Yong; JIANG Changyu) (School of Computer and Information, Anqing Normal University, Anqing, Anhui 246133, China; School of Smart Transportation Modern Industry, Anhui Sanlian University, Hefei, Anhui 230601, China)
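Affiliations: [1] School of Computer and Information, Anqing Normal University, Anqing, Anhui 246133, China [2] School of Smart Transportation Modern Industry, Anhui Sanlian University, Hefei, Anhui 230601, China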
Source: Journal of Graphics (《图学学报》), 2024, No. 3, pp. 472-481 (10 pages)
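Funding: Natural Science Foundation of Anhui Province (1608085MF144, 1908085MF194); Key Project of Natural Science Research of Anhui Higher Education Institutions (KJ2020A0498).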
Abstract: Image descriptors are an important research object in computer vision and are widely used in image classification, segmentation, recognition, and retrieval. In the local feature extraction branch, deep image descriptors do not model the correlation between the spatial and channel information of high-dimensional features, so the local features carry insufficient information. To address this, an image descriptor that fuses local and global features was proposed. In the local branch, dilated convolutions extract multi-scale feature maps; the outputs are concatenated and passed through a global attention mechanism containing a multilayer perceptron, which captures correlated channel-spatial information, and are then processed further to produce the final local features. The high-dimensional global branch generates a global feature vector through global pooling and full convolution. The components of the local features orthogonal to the global feature vector are extracted, concatenated with the global feature, and aggregated to form the final descriptor. For feature constraint, an angular-domain loss function with sub-class centers was used to improve the robustness of the model on large-scale datasets. In experiments on the public datasets Roxford5k and Rparis6k, the average retrieval precision of the proposed descriptor was 81.87% (medium) and 59.74% (hard) on Roxford5k, and 91.61% (medium) and 79.12% (hard) on Rparis6k. These are improvements of 1.70%, 1.56%, 2.00%, and 1.83%, respectively, over the deep orthogonal fusion descriptor, and the proposed descriptor also achieved better retrieval precision than the other image descriptors compared.
Keywords: image descriptor; dilated convolution; global attention; feature fusion; sub-class-center angular-domain loss
Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]
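The orthogonal fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `orthogonal_component` and `fuse` are hypothetical, features are plain Python lists rather than network tensors, and mean pooling is assumed for the aggregation step, which the abstract does not specify. Each local feature has its projection onto the global vector removed, and the remaining orthogonal parts are averaged and concatenated with the global feature.

```python
from typing import List, Sequence


def orthogonal_component(local: Sequence[float],
                         global_vec: Sequence[float]) -> List[float]:
    """Return the part of `local` that is orthogonal to `global_vec`.

    Computed as local - (<local, g> / <g, g>) * g, i.e. the local feature
    minus its projection onto the global feature vector.
    """
    norm_sq = sum(g * g for g in global_vec)
    if norm_sq == 0.0:
        raise ValueError("global feature vector must be non-zero")
    scale = sum(l * g for l, g in zip(local, global_vec)) / norm_sq
    return [l - scale * g for l, g in zip(local, global_vec)]


def fuse(local_feats: Sequence[Sequence[float]],
         global_vec: Sequence[float]) -> List[float]:
    """Concatenate the global feature with the mean orthogonal local feature.

    Assumption: mean pooling over spatial positions. Because the global
    vector is the same at every position, pooling before or after the
    concatenation gives the same result under this assumption.
    """
    orth = [orthogonal_component(f, global_vec) for f in local_feats]
    pooled = [sum(col) / len(orth) for col in zip(*orth)]
    return list(global_vec) + pooled


if __name__ == "__main__":
    # Two toy 2-D local features and one 2-D global feature.
    print(fuse([[3.0, 4.0], [1.0, 2.0]], [1.0, 0.0]))
```

In this toy case the global vector lies along the first axis, so the orthogonal components keep only the second coordinate of each local feature. In a real model the fused vector would typically pass through a further learned layer, which is omitted here.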