Constrained Concept Factorization Based on Sparseness Constraints and Dual Graph Regularization for Data Representation

Authors: 翁宗慧 (Weng Zonghui), 由从哲 (You Congzhe)

Affiliation: [1] School of Computer Engineering, Jiangsu University of Technology, Changzhou, Jiangsu

Source: Computer Science and Application (《计算机科学与应用》), 2022, No. 4, pp. 1031-1042 (12 pages)

Abstract: Concept factorization (CF) is a classical data representation method that has been widely used in machine vision, pattern recognition, and related fields. Basic CF is an unsupervised learning algorithm: it cannot exploit prior knowledge present in the data, it ignores the geometric structure of the data-space and feature-space manifolds, and its factorization results are not sparse. To address these shortcomings, this paper proposes a constrained concept factorization algorithm based on sparseness constraints and dual graph regularization (DCCFS). The algorithm preserves the intrinsic geometric structure of both the sample data space and the feature space, so that it extracts features more effectively and represents the data better; it uses the class label information naturally available in the data to strengthen its discriminative ability; and it adds an LP smooth norm to promote sparsity, which makes the factorization more stable and its results more accurate and smooth. Clustering experiments on the COIL20 image dataset, the PIE face dataset, and the TDT2 text dataset show that DCCFS provides a better representation of high-dimensional data and achieves better clustering performance than other comparable algorithms.
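For orientation, the baseline CF that the abstract takes as its starting point can be sketched as follows. This is the standard multiplicative-update form of concept factorization with a linear kernel, where a nonnegative data matrix X (features by samples) is approximated as X W V^T. It is not the DCCFS algorithm: the graph regularizers, label constraints, and LP smooth norm are not specified in this record and are therefore omitted. The function names, parameter defaults, and random test data are assumptions made for illustration.

```python
import numpy as np


def concept_factorization(X, k, n_iter=200, eps=1e-10, seed=0):
    """Baseline CF sketch: approximate X (features x samples) by X @ W @ V.T.

    X must be nonnegative so that the kernel K = X.T @ X is nonnegative.
    Returns W and V, both of shape (n_samples, k) and nonnegative.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[1]
    K = X.T @ X  # linear kernel between samples
    W = rng.random((n_samples, k))
    V = rng.random((n_samples, k))
    for _ in range(n_iter):
        # Multiplicative updates keep W and V nonnegative.
        W *= (K @ V) / (K @ W @ (V.T @ V) + eps)
        V *= (K @ W) / (V @ (W.T @ K @ W) + eps)
    return W, V


def reconstruction_error(X, W, V):
    """Frobenius norm of the residual X - X W V^T."""
    return np.linalg.norm(X - X @ W @ V.T, "fro")


if __name__ == "__main__":
    # Hypothetical data: 15 features, 40 samples, 3 concepts.
    X_demo = np.random.default_rng(1).random((15, 40))
    W_demo, V_demo = concept_factorization(X_demo, k=3)
    print(reconstruction_error(X_demo, W_demo, V_demo))
    # A simple cluster assignment reads off the largest entry in each row of V.
    print(np.argmax(V_demo, axis=1)[:10])
```

In this baseline, each row of V gives one sample's representation over the k concepts, which is the quantity a clustering experiment such as those mentioned above would typically operate on.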

Keywords: concept factorization; label information; dual graph regularization; LP smooth norm

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
