Multimodal Multimedia Retrieval Model Based on Probabilistic Latent Semantic Analysis  Cited by: 5


Authors: ZHANG Yu (张宇)[1], YUAN Ye (袁野)[1], WANG Guoren (王国仁)[1]

Affiliation: [1] College of Information Science and Engineering, Northeastern University, Shenyang 110819, China

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2015, No. 8, pp. 1665-1670 (6 pages)

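Funding: Supported by the National Natural Science Foundation of China (61025007; 61328202; 61100024); the National Basic Research Program of China ("973" Program) (2011CB302200-G); the National High Technology Research and Development Program of China ("863" Program) (2012AA011004); and the Fundamental Research Funds for the Central Universities (N130504006)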

Abstract: Multimedia content on the Internet is growing rapidly and often combines several modalities, and the different modalities within one multimedia document tend to carry similar meaning. Multimodal retrieval has therefore recently become an active topic in multimedia retrieval research. This paper proposes a multimodal multimedia retrieval model based on probabilistic latent semantic analysis (pLSA). The model rests on two assumptions: (1) the different modalities of a document (text and image) are different representations of that document and therefore express similar meaning; and (2) textual words and visual features are generated independently of each other. pLSA is used to model the generative processes of the text and the images of the training documents separately, and their latent topic distributions are learned with the expectation-maximization (EM) algorithm. Multivariate linear regression is then used to analyze the correlation between the text and image representations, and ordinary least squares (OLS) gives an estimate of the regression matrix, which is used to convert between the text and image modalities. Experimental results demonstrate the effectiveness and efficiency of the proposed model.
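The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that images are represented as counts of quantized visual features ("visual words"), that both modalities use the same number of topics, and that retrieval ranks images by Euclidean distance in topic space; none of these details is given in the abstract. The function names `plsa` and `fit_regression` and the random toy data are hypothetical.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM on a (documents x terms) count matrix.

    Returns P(z|d) with shape (D, K) and P(w|z) with shape (K, W).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_terms = counts.shape
    eps = 1e-12  # guards against division by zero
    # Random initialization, normalized into probability distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_terms))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: posterior P(z|d,w) is proportional to P(z|d) * P(w|z).
        post = p_z_d[:, :, None] * p_w_z[None, :, :]      # shape (D, K, W)
        post /= post.sum(axis=1, keepdims=True) + eps
        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * post              # n(d,w) * P(z|d,w)
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

def fit_regression(source_topics, target_topics):
    """OLS estimate of B minimizing ||source_topics @ B - target_topics||."""
    B, *_ = np.linalg.lstsq(source_topics, target_topics, rcond=None)
    return B

# Toy paired training set (random counts, for illustration only):
# 20 documents, each with a text part and an image part.
rng = np.random.default_rng(1)
text_counts = rng.integers(1, 6, size=(20, 30)).astype(float)
image_counts = rng.integers(1, 6, size=(20, 40)).astype(float)

# Assumption (2): each modality gets its own independently fitted pLSA model.
text_topics, _ = plsa(text_counts, n_topics=4)
image_topics, _ = plsa(image_counts, n_topics=4)

# Assumption (1): paired topic vectors describe the same document, so a
# linear map between them can be estimated by least squares.
text_to_image = fit_regression(text_topics, image_topics)

# Text-to-image retrieval: map a text topic vector into image topic space
# and rank the images by distance to the mapped point.
predicted = text_topics @ text_to_image
ranking = np.argsort(np.linalg.norm(image_topics - predicted[0], axis=1))
```

The dense `(D, K, W)` posterior array keeps the EM steps easy to read but would not scale to a realistic vocabulary; a practical version would iterate over the nonzero counts only. Inferring topic vectors for unseen queries (folding in) is not shown.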

Keywords: multimodal; multimedia; retrieval; probabilistic latent semantic analysis

Classification: TP301 [Automation and Computer Technology: Computer System Architecture]

 
