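Authors: WANG Huazhen (王华珍)[1], PENG Shujuan (彭淑娟)[1], GOU Jin (缑锦)[1], CHEN Duansheng (陈锻生)[1]
Affiliation: [1] College of Computer Science and Technology, Huaqiao University, Xiamen 361021
Source: Journal of System Simulation (《系统仿真学报》), 2014, No. 11, pp. 2751-2756, 6 pages
Funding: National Natural Science Foundation of China (61202298); Natural Science Foundation of Fujian Province (2012J01274); Huaqiao University High-Level Talent Research Project (09BS515); Xiamen Science and Technology Plan Project (3502Z20123032)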
Abstract: Applying machine learning methods to Traditional Chinese Medicine (TCM) diagnosis and treatment research suffers from two shortcomings: poor interactivity and discrete feature values. This paper introduces a visualization technique based on Random Forest (RF). The original data first undergo an RF-based feature transformation, which increases the class separability of the samples in the new feature space. Principal coordinate analysis then reduces the dimensionality of the transformed data, mapping the relational information of the high-dimensional data into a low-dimensional space suited to human visual perception. In that low-dimensional space, the data are visualized with scatter plots and parallel coordinate plots. Experiments on a TCM chronic gastritis dataset show that, after RF processing, the classes cluster in distinct regions and are well separated. These visualizations help users accurately understand the distribution of the dataset and its implicit trends, and in turn support deeper study of their significance for TCM.
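The abstract's pipeline (RF-based transformation, then principal coordinate analysis) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the RF transformation is the usual forest proximity (the fraction of trees in which two samples land in the same leaf) and that the dissimilarity is sqrt(1 - proximity); the record states neither choice. The `leaves` matrix is hypothetical toy data; in practice it could come from a fitted forest, for example scikit-learn's `RandomForestClassifier.apply`.

```python
import numpy as np

def rf_proximity(leaves):
    """Proximity[i, j] = fraction of trees where samples i and j share a leaf.

    leaves[i, t] is the index of the leaf that sample i reaches in tree t.
    """
    leaves = np.asarray(leaves)
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    return same_leaf.mean(axis=2)

def pcoa(dist, n_components=2):
    """Principal coordinate analysis (classical MDS) of a distance matrix."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering        # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)         # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)   # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)

# Hypothetical leaf assignments: 4 samples, 3 trees.
# Samples 0 and 1 share leaves in two trees, as do samples 2 and 3.
leaves = np.array([
    [0, 1, 0],
    [0, 1, 1],
    [2, 3, 2],
    [2, 3, 3],
])

prox = rf_proximity(leaves)
dist = np.sqrt(1.0 - prox)       # assumed proximity-to-distance conversion
coords = pcoa(dist, n_components=2)
```

Each row of `coords` is a 2-D point that can be drawn in a scatter plot; with more retained components, the same array could feed a parallel coordinates plot.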
Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]