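Authors: Zhu Lei (祝磊)[1], Cao Kaimin (曹凯敏)[1], You Xiaolu (游晓璐)[1], Xu Ping (徐平)[1], Ying Nanjiao (应南娇)[1]
Affiliation: [1] College of Life Information Science and Instrument Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China
Source: Space Medicine & Medical Engineering (《航天医学与医学工程》), 2014, No. 5, pp. 367-372 (6 pages)
Funding: National Natural Science Foundation of China (60801054; 61205200); Zhejiang Provincial Natural Science Foundation (LY12F01005)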
Abstract: Objective: To propose a classification method, based on affinity propagation clustering and semi-supervised learning, for high-dimensional, redundant SELDI protein mass spectrometry data. Methods: A t-test is first applied to the protein mass spectrometry data for preliminary dimensionality reduction. Affinity propagation clustering then reduces the dimensionality of the processed data further. Finally, a semi-supervised learning algorithm propagates labels, drawing on information from both labeled and unlabeled samples to perform classification. Results: The method achieved classification accuracies of 99.15% and 96.75% on the public ovarian cancer dataset OC-WCX2b and the public prostate cancer dataset PC-H4, respectively. On the clinical breast cancer dataset BC-WCX2a from Zhejiang Cancer Hospital, it achieved a classification accuracy of 95.18% and a sensitivity of 100%. Conclusion: The clustering-based semi-supervised learning method makes effective use of the information in unlabeled mass spectrometry samples; compared with classical supervised learning algorithms, it gives better classification performance and is more practical.
Classification number: R318.04 [Medicine and Health: Biomedical Engineering]
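The first step named in the abstract, t-test screening of spectral features, can be illustrated with a minimal sketch. The record does not say which t-test variant or selection threshold the authors used, so Welch's unequal-variance statistic and a fixed top-k cutoff are assumptions here. The function names `welch_t` and `select_top_features` and the toy intensity matrix are hypothetical and chosen only for illustration. The later clustering and label-propagation steps are not covered.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (assumed variant)."""
    denom = sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = mean(a) - mean(b)
    if denom == 0.0:
        # Both groups are constant: no separation if the means match,
        # otherwise treat the feature as perfectly separating.
        return 0.0 if diff == 0 else float("inf") * (1 if diff > 0 else -1)
    return diff / denom


def select_top_features(X, y, k):
    """Return the indices of the k features with the largest |t|.

    X is a list of samples (each a list of feature values) and
    y holds the class labels, 1 for cases and 0 for controls.
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        cases = [row[j] for row, label in zip(X, y) if label == 1]
        controls = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(welch_t(cases, controls)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy data (made up): 6 spectra, 4 peak intensities each.
X = [
    [1.0, 5.1, 0.30, 9.8],
    [1.2, 5.3, 0.10, 10.1],
    [0.9, 4.9, 0.20, 10.0],
    [1.1, 1.0, 0.25, 2.1],
    [1.0, 1.2, 0.15, 1.9],
    [1.05, 0.8, 0.30, 2.0],
]
y = [1, 1, 1, 0, 0, 0]

selected = select_top_features(X, y, k=2)
print(selected)
```

In this toy matrix only the second and fourth columns differ clearly between the two groups, so those are the ones retained. In a pipeline like the one the abstract describes, the retained columns would then be passed to the clustering step.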