CMDC: an iterative algorithm for complementary multi-view document clustering  Cited by: 4


Authors: HUANG Ruizhang (黄瑞章) [1,2]; BAI Ruina (白瑞娜); CHEN Yanping (陈艳平); QIN Yongbin (秦永彬); CHENG Xinyu (程欣宇) [1,3]; TIAN Youliang (田有亮)

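Affiliations: [1] College of Computer Science and Technology, Guizhou University, Guiyang 550025, China; [2] Guizhou Provincial Key Laboratory of Public Big Data, Guiyang 550025, China; [3] Guizhou Intelligent Human-Computer Interaction Engineering Technology Research Center, Guiyang 550025, China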

Source: Journal on Communications (《通信学报》), 2020, No. 8, pp. 155-164 (10 pages)

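Funding: National Natural Science Foundation of China (No. 61462011, No. 91746116); Joint Funds of the National Natural Science Foundation of China (No. U1836205); Science and Technology Foundation of Guizhou Province (No. [2020]1Z055).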

Abstract: Traditional multi-view document clustering methods separate document representation from the clustering process and ignore the complementary characteristics among views. To address these problems, an iterative algorithm for complementary multi-view document clustering, CMDC, was proposed, in which document clustering and feature adjustment are optimized in a unified manner. CMDC selects complementary documents from the clustering results across views and, through a local metric learning algorithm, uses them to tune the features of each view. The partition consistency of multi-view document clustering is thus addressed through the metric consistency of the views. Experimental results show that CMDC effectively improves multi-view clustering performance.

Keywords: multi-view document clustering; complementary documents; constrained document clustering; metric computation

Classification No.: TP301 [Automation and Computer Technology: Computer System Architecture]

 
