Penalized Matrix Decomposition Based on CLSVSM and Its Application in Text Topic Clustering

Cited by: 1


Authors: NIU Feng-gao [1]; FENG Shi-jia; HUANG Chen (School of Mathematical Sciences, Shanxi University, Taiyuan 030006, China)

Affiliation: [1] School of Mathematical Sciences, Shanxi University, Taiyuan 030006, Shanxi, China

Source: Computer and Modernization, 2021, No. 5, pp. 66-72 (7 pages)

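Funding: Shanxi Province Applied Basic Research Program (Outstanding Young Scholars Fund) (201801D211002); National Statistical Science Research Project (2017LY04); Shanxi Province Higher Education Outstanding Achievements Cultivation Project (2019KJ004).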

Abstract: A sound representation of text information is important for text topic clustering and retrieval. To address the high dimensionality of text representation models, penalized matrix decomposition (PMD) is studied on the basis of the co-occurrence latent semantic vector space model (CLSVSM): PMD imposes sparsity constraints on the vectors to extract core feature terms, from which the original data are reconstructed. Using co-occurrence analysis theory together with PMD, the semantic information among feature terms is mined in depth and a semantic kernel function (PMD_K) is constructed. The proposed methods are applied to text topic clustering. Experimental results show that both PMD and PMD_K cluster markedly better than the other methods compared; taking the F-measure as an example, PMD_K improves on the earlier 95%CLSVSM_K method by 21.9%. Combining PMD with the text representation model improves both the efficiency and the accuracy of text topic clustering while avoiding complex operations on high-dimensional matrices.
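The sparsity-constrained factorization mentioned in the abstract can be illustrated with a minimal sketch of the generic rank-one PMD update: alternating soft-thresholding under L1 bounds on both factors. This record does not give the paper's actual algorithm, parameters, or the CLSVSM construction, so everything below is an assumption for illustration: the function names, the toy term-document matrix, and the bounds `c_u` and `c_v` are hypothetical.

```python
import numpy as np


def soft_threshold(a, delta):
    """Shrink every entry of `a` toward zero by `delta`."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)


def l1_bounded_unit(a, c, n_iter=60):
    """Return S(a, delta) / ||S(a, delta)||_2, with delta >= 0 found by
    bisection so that the result has L1 norm at most c (assumes c >= 1)."""
    norm = np.linalg.norm(a)
    if norm == 0:
        return a
    if np.abs(a).sum() / norm <= c:
        return a / norm  # constraint already satisfied, no shrinkage
    lo, hi = 0.0, np.abs(a).max()
    for _ in range(n_iter):
        mid = (lo + hi) / 2.0
        s = soft_threshold(a, mid)
        n = np.linalg.norm(s)
        if n == 0 or np.abs(s).sum() / n <= c:
            hi = mid  # feasible: try a smaller threshold
        else:
            lo = mid  # infeasible: shrink harder
    s = soft_threshold(a, hi)
    n = np.linalg.norm(s)
    return s / n if n > 0 else s


def pmd_rank1(X, c_u, c_v, n_iter=100):
    """Rank-one sparse factor (d, u, v) with X roughly equal to d * u v^T."""
    # Start from the leading right singular vector of X.
    v = np.linalg.svd(X, full_matrices=False)[2][0]
    for _ in range(n_iter):
        u = l1_bounded_unit(X @ v, c_u)
        v = l1_bounded_unit(X.T @ u, c_v)
    d = float(u @ X @ v)
    return d, u, v


# Hypothetical toy term-document matrix (rows: terms, columns: documents).
X_toy = np.array([
    [3.0, 2.0, 0.0, 0.0],
    [2.0, 3.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 1.0, 2.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
])
d, u, v = pmd_rank1(X_toy, c_u=1.5, c_v=1.5)
X_rank1 = d * np.outer(u, v)  # sparse rank-one reconstruction
print("terms kept:", np.flatnonzero(u), "documents kept:", np.flatnonzero(v))
```

Smaller values of `c_u` and `c_v` give sparser factors; further factors could be obtained by subtracting `d * np.outer(u, v)` from the matrix and repeating, which is one common deflation scheme and not necessarily the one used in the paper.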

Keywords: CLSVSM; penalized matrix decomposition; semantic kernel function; text topic clustering

Classification code: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
