Authors: YU Jun (庾骏); MA Jiangtao (马江涛); XIAN Yang (咸阳); HOU Ruixia (侯瑞霞)[2]; SUN Wei (孙伟)[3]
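Affiliations: [1] College of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000, China; [2] Research Institute of Resource Information Techniques, Chinese Academy of Forestry (CAF), Beijing 100091, China; [3] Agricultural Information Institute, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100081, China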
Source: Journal of Electronics & Information Technology (《电子与信息学报》), 2024, No. 2, pp. 481-491 (11 pages)
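Funding: National Natural Science Foundation of China (32271880); Henan Province Science and Technology Research Project (222102210064); Natural Science Foundation of Henan Province (232300420150).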
Abstract: Multimodal hashing converts heterogeneous multimodal data into unified binary codes. Owing to its low storage cost and fast Hamming-distance ranking, it has attracted wide attention in large-scale multimedia retrieval. Existing multimodal hashing methods assume that every query carries complete information from all modalities, from which its joint hash code is generated. In practice, however, fully complete multimodal information is difficult to obtain. To handle semi-paired query scenarios, in which some modal information is missing, a novel Semi-paired Query Hashing (SPQH) method is proposed to solve the joint encoding problem for semi-paired query samples. First, the method performs projection learning and cross-modal reconstruction learning to maintain semantic consistency among multimodal data. Then, the semantic similarity structure of the label space and the complementary information among modalities are captured to learn a discriminative hash function. During the query encoding stage, the missing modal features of unpaired samples are completed using the learned cross-modal reconstruction matrix, and the hash features are then generated by the learned joint hash function. Compared with state-of-the-art baseline methods, the average retrieval accuracy on the Pascal Sentence, NUS-WIDE, and IAPR TC-12 datasets improves by 2.48%. Experimental results demonstrate that the algorithm effectively encodes semi-paired multimodal query data and achieves superior retrieval performance.
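The query-encoding stage described in the abstract can be illustrated with a minimal sketch. It assumes two modalities (image and text), linear reconstruction matrices, and a sign-thresholded linear hash function; the abstract specifies none of these forms, and every name, dimension, and random matrix below is a hypothetical stand-in for parameters that SPQH would learn during training.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def complete_query(image_feat, text_feat, img_to_txt, txt_to_img):
    """Fill in whichever modality is missing (None) by linear reconstruction."""
    if image_feat is None and text_feat is None:
        raise ValueError("at least one modality is required")
    if text_feat is None:
        text_feat = matvec(img_to_txt, image_feat)
    if image_feat is None:
        image_feat = matvec(txt_to_img, text_feat)
    return image_feat, text_feat


def joint_hash(image_feat, text_feat, hash_proj):
    """Binarize a linear projection of the concatenated features.

    The linear-projection-plus-threshold form is an assumption made for
    illustration, not the hash function defined in the paper.
    """
    joint = image_feat + text_feat  # list concatenation
    return [1 if v >= 0 else 0 for v in matvec(hash_proj, joint)]


def hamming(code_a, code_b):
    """Count the positions at which two binary codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


# Hypothetical dimensions; random matrices stand in for learned parameters.
rng = random.Random(0)
D_IMG, D_TXT, N_BITS = 4, 3, 8


def rand_matrix(rows, cols):
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


img_to_txt = rand_matrix(D_TXT, D_IMG)   # image -> text reconstruction
txt_to_img = rand_matrix(D_IMG, D_TXT)   # text -> image reconstruction
hash_proj = rand_matrix(N_BITS, D_IMG + D_TXT)

# A semi-paired query that has image features but no text features.
image_only = [0.5, -1.0, 0.25, 2.0]
img, txt = complete_query(image_only, None, img_to_txt, txt_to_img)
code = joint_hash(img, txt, hash_proj)
```

Database items encoded with the same `joint_hash` can then be ranked against `code` by `hamming`, which is the fast Hamming-distance ranking the abstract refers to.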
Keywords: Multimodal information retrieval; Hashing; Semi-paired data; Cross-modal reconstruction; Binary encoding
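Classification codes: TN911.7 [Electronics and Telecommunications: Communication and Information Systems]; TP391 [Electronics and Telecommunications: Information and Communication Engineering]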