Projection-pursuit-based dimension reduction for visualization of text features  Cited by: 3

Authors: Gao Maoting (高茂庭)[1], Lu Peng (陆鹏)[1]

Affiliation: [1] College of Information Engineering, Shanghai Maritime University, Shanghai 200135

Source: Journal of Computer Applications (《计算机应用》), 2008, No. 6, pp. 1411-1413, 1416 (4 pages)

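Funding: National Natural Science Foundation of China (60275020); Scientific Research Project of the Shanghai Municipal Education Commission (06FZ007); Key Discipline Construction Project of Shanghai Maritime University (XL0101)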

Abstract: A projection pursuit model, with a genetic algorithm optimizing the projection direction, projects high-dimensional text feature data onto a low-dimensional space (2 or 3 dimensions) suitable for visualization. The projected feature values of the high-dimensional data in this low-dimensional space reflect its linear and nonlinear structures or features, which reduces the dimensionality and makes the text features visualizable. The method greatly reduces the computational complexity of text mining. It also helps determine the number of initial cluster centers for the K-means clustering algorithm, which improves that algorithm's accuracy. Experiments verify the effectiveness of the method for dimension reduction of text features.
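The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which projection index or which genetic operators the paper uses. The sketch assumes the index Q(a) = S_z · D_z (standard deviation of the projections times a local density term) with a window radius of 0.1 · S_z, and a simple real-coded genetic algorithm with truncation selection, blend crossover, Gaussian mutation, and elitism. All function names, parameters, and the toy data are hypothetical. It searches for a single direction; a 2- or 3-dimensional view would need further directions.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def project(data, direction):
    """Project every row of `data` onto the unit vector `direction`."""
    return [sum(x * a for x, a in zip(row, direction)) for row in data]


def projection_index(data, direction):
    """Assumed index Q(a) = S_z * D_z: spread times local density."""
    z = project(data, direction)
    n = len(z)
    mean = sum(z) / n
    s_z = math.sqrt(sum((v - mean) ** 2 for v in z) / (n - 1))
    radius = 0.1 * s_z  # assumed window radius, not taken from the paper
    d_z = 0.0
    for i in range(n):
        for j in range(n):
            gap = radius - abs(z[i] - z[j])
            if gap > 0:  # only pairs closer than the window contribute
                d_z += gap
    return s_z * d_z


def ga_search(data, pop_size=30, generations=40, mutation_scale=0.2, seed=0):
    """Search for a unit direction that maximizes the projection index."""
    rng = random.Random(seed)
    dim = len(data[0])
    population = [
        normalize([rng.gauss(0, 1) for _ in range(dim)]) for _ in range(pop_size)
    ]
    for _ in range(generations):
        ranked = sorted(
            population, key=lambda a: projection_index(data, a), reverse=True
        )
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [ranked[0]]  # elitism: keep the best direction unchanged
        while len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            w = rng.random()
            # blend crossover plus Gaussian mutation, then back to unit length
            child = [
                w * x + (1 - w) * y + rng.gauss(0, mutation_scale)
                for x, y in zip(p, q)
            ]
            children.append(normalize(child))
        population = children
    return max(population, key=lambda a: projection_index(data, a))


def make_toy_data(seed=1):
    """Hypothetical stand-in for text feature vectors: two clusters in 4-D."""
    rng = random.Random(seed)
    centers = [[0, 0, 0, 0], [3, 3, 0, 0]]
    return [
        [c + rng.gauss(0, 0.3) for c in center]
        for center in centers
        for _ in range(15)
    ]


if __name__ == "__main__":
    data = make_toy_data()
    best = ga_search(data)
    print("direction:", [round(a, 3) for a in best])
    print("projected values:", [round(v, 2) for v in sorted(project(data, best))])
```

Plotting or sorting the projected values is one way to see how many groups the data forms, which is the kind of cue the abstract describes for choosing the number of initial K-means centers.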

Keywords: projection pursuit; dimension reduction; text mining; genetic algorithm

Classification code: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
