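Affiliation: [1] College of Computer Science and Technology, Huaqiao University, Xiamen, Fujian 361021
Source: Journal of Computer Applications (《计算机应用》), 2014, No. 6, pp. 1613-1617, 1644 (6 pages)
Funding: Natural Science Foundation of Fujian Province (2012J01274); Huaqiao University Research Start-up Fund for High-Level Talents (09BS515)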
Abstract: Most current methods for mining high-dimensional data rest on mathematical theory rather than visual intuition. To make high-dimensional data easier to analyze and evaluate visually, the paper proposes using Random Forest (RF) for data visualization. First, RF is trained by supervised learning to obtain a proximity (similarity) measure between samples, and principal coordinate analysis is applied to these proximities for dimension reduction, mapping the relational information of the high-dimensional data into a low-dimensional space. Then the data are visualized in that low-dimensional space with scatter plots. Experiments on high-dimensional gene datasets show that visualization based on RF supervised dimension reduction reveals the class distribution of high-dimensional data well and outperforms visualization after traditional unsupervised dimension reduction.
Classification: TP391.411 [Automation and Computer Technology: Computer Application Technology]
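
The pipeline outlined in the abstract (RF proximities, then principal coordinate analysis, then a scatter plot) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `proximity_from_leaves`, `pcoa`, and `rf_embedding`, the tree count, and the conversion of proximity to squared distance as `1 - proximity` are all assumptions made for the example. The RF step uses scikit-learn's `RandomForestClassifier.apply`, which returns the leaf index each sample reaches in each tree.

```python
import numpy as np


def proximity_from_leaves(leaves):
    """Fraction of trees in which each pair of samples shares a leaf.

    leaves: integer array of shape (n_samples, n_trees), one leaf index per tree.
    """
    leaves = np.asarray(leaves)
    # same[i, j, t] is True when samples i and j land in the same leaf of tree t.
    # This uses O(n^2 * n_trees) memory, which is fine for small sample counts.
    same = leaves[:, None, :] == leaves[None, :, :]
    return same.mean(axis=2)


def pcoa(proximity, n_components=2):
    """Principal coordinate analysis (classical MDS) of a proximity matrix."""
    prox = np.asarray(proximity, dtype=float)
    n = prox.shape[0]
    # Assumption for this sketch: squared distance d_ij^2 = 1 - proximity_ij.
    d2 = 1.0 - prox
    # Double-centre the squared distances: B = -1/2 * J d2 J, J = I - 11^T / n.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ d2 @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    # Keep the n_components largest eigenvalues and their eigenvectors.
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues that arise from rounding.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def rf_embedding(X, y, n_trees=200, seed=0):
    """Supervised 2-D embedding: RF proximities followed by PCoA."""
    # Imported lazily so the two helpers above need only NumPy.
    from sklearn.ensemble import RandomForestClassifier

    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X, y)
    leaves = forest.apply(X)  # shape (n_samples, n_trees)
    return pcoa(proximity_from_leaves(leaves), n_components=2)
```

The two columns returned by `rf_embedding` can then be drawn as a scatter plot with points coloured by class label, which corresponds to the visualization step the abstract describes.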