Authors: ZHANG Xiaobo (张晓博) [1,2,3]; YANG Yan (杨燕) [1,2,3]; LI Tianrui (李天瑞) [1,2,3]; LU Fan (陆凡); PENG Lilan (彭莉兰)
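Affiliations: [1] School of Information Science and Technology, Southwest Jiaotong University, Chengdu, Sichuan 611756, China; [2] Institute of Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China; [3] National Engineering Laboratory of Integrated Transportation Big Data Application Technology (Southwest Jiaotong University), Chengdu, Sichuan 611756, China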
Source: Journal of Computer Applications (《计算机应用》), 2020, No. 10, pp. 3088-3094 (7 pages)
Funding: National Natural Science Foundation of China (61976247); Sichuan Province Key Research and Development Program (20ZDYF2837).
Abstract: To address the early intelligent diagnosis of Parkinson's Disease (PD), which occurs mostly in the elderly, clustering techniques based on medical examination text data were proposed for the analysis and prediction of PD. Firstly, the original dataset was pre-processed to obtain effective feature information, and the original features were reduced by Principal Component Analysis (PCA) to eight feature spaces of different dimensions. Then, five classical clustering models and three clustering ensemble methods were each used to cluster the data in the eight feature spaces. Finally, four clustering performance indexes were used to evaluate the prediction of PD subjects with dopamine deficiency, healthy controls, and PD subjects with Scans Without Evidence of Dopamine Deficiency (SWEDD). The simulation results show that the clustering accuracy of the Gaussian Mixture Model (GMM) reaches 89.12% when the PCA feature dimension is 30, the clustering accuracy of Spectral Clustering (SC) reaches 61.41% when the PCA feature dimension is 70, and the clustering accuracy of the Meta-CLustering Algorithm (MCLA) reaches 59.62% when the PCA feature dimension is 80. The comparative experiments show that, among the five classical clustering methods, GMM clusters best when the PCA feature dimension is less than 40, and that, among the three clustering ensemble methods, MCLA performs well across the different feature dimensions. These results provide technical and theoretical support for the early intelligent auxiliary diagnosis of PD.
Keywords: Parkinson's disease; medical text data; principal component analysis; clustering; clustering ensemble
Classification code: TP391.7 [Automation and Computer Technology: Computer Application Technology]
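The abstract reports its headline results as clustering accuracy, but this record does not say how that accuracy was computed. One common definition scores a clustering against known class labels under the best one-to-one mapping from cluster ids to classes. The sketch below illustrates that definition only; it is not the authors' evaluation code. The function name `clustering_accuracy` and the toy labels are hypothetical, and the brute-force search is practical only for a handful of clusters, such as the three groups here (PD, healthy control, SWEDD).

```python
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids):
    """Fraction of samples labeled correctly under the best one-to-one
    mapping from cluster ids to class labels (brute force, small k only)."""
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("label sequences must be non-empty")
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    if len(clusters) > len(classes):
        raise ValueError("this sketch assumes no more clusters than classes")
    best = 0
    # Try every assignment of distinct class labels to the clusters.
    for assigned in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, assigned))
        hits = sum(mapping[c] == t for t, c in zip(true_labels, cluster_ids))
        best = max(best, hits)
    return best / len(true_labels)


# Hypothetical toy data, not taken from the paper's dataset.
truth = ["PD", "PD", "PD", "HC", "HC", "SWEDD"]
found = [2, 2, 0, 0, 0, 1]
print(clustering_accuracy(truth, found))
```

With more than a few clusters, the permutation search becomes expensive, and an assignment solver such as the Hungarian algorithm is the usual replacement.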