Deep asymmetric discrete cross-modal hashing method

Authors: WANG Xiaoyu; WANG Zhanqing [1]; XIONG Wei (School of Science, Wuhan University of Technology, Wuhan, Hubei 430070, China)

Affiliation: [1] School of Science, Wuhan University of Technology, Wuhan 430070

Source: Journal of Computer Applications (《计算机应用》), 2022, No. 8, pp. 2461-2470 (10 pages)

Funding: Supported by the Fundamental Research Funds for the Central Universities (2019ZY232).

Abstract: Most deep supervised cross-modal hashing methods learn hash codes symmetrically, so they cannot make effective use of the supervision information in large-scale datasets. In addition, the discrete constraint on hash codes is usually handled with a relaxation-based strategy, which produces a large quantization error and leads to sub-optimal hash codes. To address these problems, a Deep Asymmetric Discrete Cross-modal Hashing (DADCH) method is proposed. First, an asymmetric learning framework combining deep neural networks and dictionary learning is constructed to learn the hash codes of query instances and database instances, which mines the supervision information of the data more effectively and reduces the training time of the model. Then, a discrete optimization algorithm optimizes the hash code matrix column by column to reduce the quantization error of hash code binarization. At the same time, to fully mine the semantic information of the data, a label layer is added to the neural network for label prediction, and semantic information embedding is used to embed the discriminative information of different categories into the hash codes through a linear mapping, which makes the hash codes more discriminative. Experimental results show that on the IAPR-TC12, MIRFLICKR-25K and NUS-WIDE datasets, with a hash code length of 64 bits, the mean Average Precision (mAP) of the proposed method on image-to-text retrieval is about 11.6, 5.2 and 14.7 percentage points higher, respectively, than that of Self-Supervised Adversarial Hashing (SSAH), an advanced deep cross-modal retrieval method proposed in recent years.
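The mAP figure quoted in the abstract can be illustrated with a minimal sketch of how mean Average Precision is commonly computed for hash-based retrieval. This is not the authors' evaluation code. It assumes that database items are ranked by Hamming distance to the query code, and that a database item counts as relevant when it shares at least one label with the query; the abstract states neither detail. All function names are hypothetical.

```python
def hamming_distance(code_a, code_b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def average_precision(query_code, query_label, db_codes, db_labels):
    """AP of one query under Hamming ranking.

    Assumption: an item is relevant if it shares at least one label
    with the query (labels are multi-hot 0/1 lists).
    Returns None when the database holds no relevant item.
    """
    # sorted() is stable, so ties keep their database order.
    order = sorted(range(len(db_codes)),
                   key=lambda i: hamming_distance(query_code, db_codes[i]))
    hits = 0
    precisions = []
    for rank, idx in enumerate(order, start=1):
        if any(q and d for q, d in zip(query_label, db_labels[idx])):
            hits += 1
            precisions.append(hits / rank)  # precision at this relevant hit
    return sum(precisions) / hits if hits else None


def mean_average_precision(query_codes, query_labels, db_codes, db_labels):
    """Mean of per-query AP, skipping queries with no relevant item."""
    aps = []
    for code, label in zip(query_codes, query_labels):
        ap = average_precision(code, label, db_codes, db_labels)
        if ap is not None:
            aps.append(ap)
    return sum(aps) / len(aps) if aps else 0.0
```

For image-to-text retrieval, `query_codes` would hold the hash codes of the query images and `db_codes` the hash codes of the database texts; swapping the two roles gives text-to-image retrieval.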

Keywords: cross-modal retrieval; deep neural network; asymmetric hashing; semantic information embedding; discrete optimization

Classification number: TP391.3 [Automation and Computer Technology: Computer Application Technology]
