Semi-paired Multi-modal Query Hashing Method

Authors: YU Jun; MA Jiangtao; XIAN Yang; HOU Ruixia [2]; SUN Wei [3]

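Affiliations: [1] College of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000, China; [2] Research Institute of Resource Information Techniques, Chinese Academy of Forestry (CAF), Beijing 100091, China; [3] Agricultural Information Institute, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100081, China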

Source: Journal of Electronics & Information Technology, 2024, No. 2, pp. 481-491 (11 pages)

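Funding: National Natural Science Foundation of China (32271880); Henan Province Science and Technology Research Project (222102210064); Natural Science Foundation of Henan Province (232300420150).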

Abstract: Multimodal hashing converts heterogeneous multimodal data into unified binary codes. Because of its low storage cost and fast Hamming distance ranking, it has attracted widespread attention in large-scale multimedia retrieval. Existing multimodal hashing methods assume that every query has complete information in all modalities from which to generate its joint hash code. In practical applications, however, complete multimodal information is difficult to obtain. To handle semi-paired query scenarios in which some modal information is missing, a novel Semi-paired Query Hashing (SPQH) method is proposed to solve the joint encoding problem for semi-paired query samples. First, the method performs projection learning and cross-modal reconstruction learning to maintain semantic consistency among multimodal data. Then, the semantic similarity structure of the label space and the complementary information among modalities are captured to learn a discriminative hash function. In the query encoding stage, the missing modal features of unpaired samples are completed with the learned cross-modal reconstruction matrix, and hash features are then generated by the learned joint hash function. Compared with state-of-the-art baseline methods, the average retrieval accuracy on the Pascal Sentence, NUS-WIDE, and IAPR TC-12 datasets improves by 2.48%. Experimental results demonstrate that the algorithm effectively encodes semi-paired multimodal query data and achieves superior retrieval performance.
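The query encoding stage described in the abstract (complete the missing modality, apply a joint hash function, rank by Hamming distance) can be illustrated with a toy sketch. This is not the SPQH implementation: the abstract does not give the objective function or the learned matrices, so the reconstruction matrices `IMG_TO_TXT` and `TXT_TO_IMG`, the projection `PROJECTION`, the sign-threshold binarization, the feature dimensions, and all function names below are assumptions chosen by hand for illustration.

```python
# Toy sketch only: every matrix below is hand-picked for illustration,
# not learned as in the paper.

def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def complete_query(image_feat, text_feat, img_to_txt, txt_to_img):
    """Fill in a missing modality with a linear cross-modal reconstruction."""
    if image_feat is None and text_feat is None:
        raise ValueError("at least one modality is required")
    if text_feat is None:
        text_feat = matvec(img_to_txt, image_feat)
    if image_feat is None:
        image_feat = matvec(txt_to_img, text_feat)
    return image_feat, text_feat

def joint_hash(image_feat, text_feat, projection):
    """Project the concatenated features and binarize with a sign threshold."""
    scores = matvec(projection, image_feat + text_feat)
    return [1 if s >= 0 else 0 for s in scores]

def hamming(code_a, code_b):
    """Count the bit positions where two equal-length codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))

def rank_by_hamming(query_code, database_codes):
    """Return database indices sorted by Hamming distance to the query."""
    return sorted(range(len(database_codes)),
                  key=lambda i: hamming(query_code, database_codes[i]))

# Hypothetical setup: 2-D image features, 2-D text features, 3-bit codes.
IMG_TO_TXT = [[1.0, 0.0], [0.0, -1.0]]
TXT_TO_IMG = [[1.0, 0.0], [0.0, -1.0]]
PROJECTION = [[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, -1.0],
              [1.0, -1.0, 0.0, 0.0]]

# An unpaired query that has an image feature but no text feature.
img, txt = complete_query([0.5, -2.0], None, IMG_TO_TXT, TXT_TO_IMG)
query_code = joint_hash(img, txt, PROJECTION)

database = [[0, 1, 0], [1, 0, 1], [1, 1, 1]]
ranking = rank_by_hamming(query_code, database)
print(query_code, ranking)  # prints [1, 0, 1] [1, 2, 0]
```

In this sketch the image-only query is first given a reconstructed text feature, so the same joint hash function serves both paired and unpaired queries; that shared code path is the point the abstract makes about semi-paired encoding.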

Keywords: multimodal information retrieval; hashing; semi-paired data; cross-modal reconstruction; binary coding

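Classification: TN911.7 [Electronics and Telecommunications: Communication and Information Systems]; TP391 [Electronics and Telecommunications: Information and Communication Engineering]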
