An End-to-End Cross-Media Retrieval Method Based on Pre-Trained Representation Learning

Original Chinese title: 基于预训练表示学习的端到端跨媒体检索方法

Authors: LIU Tongtong (刘桐彤), QU Dan (屈丹) [1] (Information Engineering University, Zhengzhou 450001, China)

Source: Journal of Information Engineering University (《信息工程大学学报》), 2022, No. 5, pp. 563-569 (7 pages)

Funding: National Natural Science Foundation of China (62171470, 61673395).

Abstract: Cross-media retrieval combines representation learning for media data with alignment of information across media. Existing cross-media representation learning methods do not incorporate state-of-the-art single-media representation learning, and they lack effective semantic alignment of high-level cross-media representations. This paper proposes an end-to-end cross-media retrieval method based on pre-trained representation learning. The method uses a residual network (ResNet) to extract high-level image features and the Bidirectional Encoder Representations from Transformers (BERT) model to extract high-level text features, then applies a self-attention mechanism to mine semantic associations across media and thereby align cross-media information semantically. With mean average precision (mAP) as the evaluation metric, the model is validated on three widely used cross-media datasets. Experiments show that the proposed method achieves higher mAP than the compared methods on all three datasets.

Keywords: cross-media retrieval; representation learning; ResNet; BERT; self-attention

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
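
The abstract names mean average precision (mAP) as the evaluation metric but does not spell out how it is computed. The following is a minimal sketch of mAP for an image-to-text retrieval task, not the authors' evaluation code. It assumes that a retrieved item counts as relevant when it shares the query's class label and that items are ranked by cosine similarity; the abstract states neither. The function names and the toy two-dimensional embeddings are hypothetical and only illustrate the arithmetic.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average_precision(query_vec, query_label, gallery_vecs, gallery_labels):
    """AP for one query: mean of precision@k over ranks k holding a relevant item."""
    # Rank gallery items from most to least similar (assumed ranking score: cosine).
    order = sorted(
        range(len(gallery_vecs)),
        key=lambda i: cosine(query_vec, gallery_vecs[i]),
        reverse=True,
    )
    hits, precisions = 0, []
    for rank, idx in enumerate(order, start=1):
        # Assumed relevance criterion: same class label as the query.
        if gallery_labels[idx] == query_label:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(query_vecs, query_labels, gallery_vecs, gallery_labels):
    """mAP: average of per-query AP values."""
    aps = [
        average_precision(q, label, gallery_vecs, gallery_labels)
        for q, label in zip(query_vecs, query_labels)
    ]
    return sum(aps) / len(aps)


# Toy data (hypothetical): image embeddings as queries, text embeddings as gallery.
image_queries = [[1.0, 0.0], [0.0, 1.0]]
image_labels = [0, 1]
text_gallery = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9]]
text_labels = [0, 0, 1, 1]

score = mean_average_precision(image_queries, image_labels, text_gallery, text_labels)
print(round(score, 4))  # 0.8333
```

In this toy case each query finds its relevant items at ranks 1 and 3, so each AP is (1/1 + 2/3) / 2 = 5/6, and the mAP is also 5/6. Swapping the roles of the two lists gives the text-to-image direction.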
