Clustering Web Images by Correlation Mining of Image-Text  Cited by: 10


Authors: Wu Fei[1], Han Yahong[1], Zhuang Yueting[1], Shao Jian[1]

Affiliation: [1] College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang 310027, China

Source: Journal of Software (《软件学报》), 2010, No. 7, pp. 1561-1575 (15 pages)

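Funding: National Natural Science Foundation of China, Nos. 60603096 and 60533090; National High-Tech Research and Development Program of China (863 Program), No. 2006AA010107; Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT), No. IRT0652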

Abstract: To cluster the results of Web image retrieval, this paper proposes a graph-based clustering method for Web images. It uses the text surrounding each image to mine two types of links: heterogeneous links between word nodes and image nodes, and homogeneous links between word nodes. Because the standard tf-idf method cannot directly reflect the semantic correlation between a word and an image, the paper defines a word attribute called visibility and integrates it into the tf-idf model to weight the word-image links. The weights of the word-word links are computed by a topic relevance function defined on the LDA (latent Dirichlet allocation) model. Finally, complex graph clustering and bipartite spectral co-clustering algorithms are applied to test the effect of introducing visibility and topic relevance into the graph model. The reported experimental results indicate that the two kinds of correlations improve Web image clustering performance.
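The two link weights described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the visibility scores are hand-picked placeholder values in [0, 1], visibility is combined with tf-idf by simple multiplication, and the topic relevance function is stood in for by cosine similarity between per-word topic vectors (such as those an LDA model might yield). The function names `tfidf_visibility_weights` and `topic_relevance` are hypothetical.

```python
import math
from collections import Counter


def tfidf_visibility_weights(docs, visibility, default_visibility=0.5):
    """Weight word-image links as tf-idf scaled by a visibility score.

    docs: dict mapping image id -> list of tokens from its surrounding text.
    visibility: dict mapping word -> score in [0, 1] (placeholder values;
        the paper's actual definition of visibility is not reproduced here).
    Returns a dict mapping (word, image id) -> link weight.
    """
    n_docs = len(docs)
    # Document frequency: number of images whose text contains the word.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    weights = {}
    for image_id, tokens in docs.items():
        counts = Counter(tokens)
        for word, count in counts.items():
            tf = count / len(tokens)
            idf = math.log(n_docs / df[word])
            # Assumed combination rule: multiply tf-idf by visibility.
            vis = visibility.get(word, default_visibility)
            weights[(word, image_id)] = tf * idf * vis
    return weights


def topic_relevance(word_topics, w1, w2):
    """Cosine similarity of two words' topic vectors.

    This is a stand-in for the paper's topic relevance function,
    not the function the paper defines.
    """
    a, b = word_topics[w1], word_topics[w2]
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy data: surrounding text of three images and made-up scores.
docs = {
    "img1": ["tiger", "grass", "photo"],
    "img2": ["tiger", "zoo", "photo"],
    "img3": ["car", "road", "photo"],
}
visibility = {"tiger": 0.9, "grass": 0.8, "zoo": 0.4,
              "car": 0.9, "road": 0.7, "photo": 0.1}
word_topics = {"tiger": [0.9, 0.1], "zoo": [0.8, 0.2], "car": [0.1, 0.9]}

word_image_weights = tfidf_visibility_weights(docs, visibility)
word_word_weight = topic_relevance(word_topics, "tiger", "zoo")
```

In this toy data, "photo" occurs in every text, so its idf is zero and it contributes no word-image link regardless of its visibility. The resulting weights could serve as edge weights of a word-image bipartite graph and a word-word graph, which a graph clustering algorithm would then partition.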

Keywords: graph clustering; complex graph; visibility; LDA (latent Dirichlet allocation); spectral clustering

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
