Characterizing diseases using genetic and clinical variables: A data analytics approach


Authors: Madhuri Gollapalli, Harsh Anand, Satish Mahadevan Srinivasan

Affiliations: [1] Engineering Department, Penn State Great Valley, Malvern, Pennsylvania, USA; [2] Department of Systems and Information Engineering, School of Engineering and Applied Science, University of Virginia, Charlottesville, Virginia, USA

Source: Quantitative Biology (English edition), 2024, Issue 3, pp. 271-285, 15 pages

Abstract: Predictive analytics is crucial in precision medicine for personalized patient care. To aid in precision medicine, this study identifies a subset of genetic and clinical variables that can serve as predictors for classifying diseased tissues/disease types. To achieve this, experiments were performed on diseased tissues obtained from the L1000 dataset to assess differences in the functionality and predictive capabilities of genetic and clinical variables. In this study, the k-means technique was used for clustering the diseased tissue types, and the multinomial logistic regression (MLR) technique was applied for classifying the diseased tissue types. Dimensionality reduction techniques, including principal component analysis and Boruta, were used extensively to reduce the dimensionality of genetic and clinical variables. The results showed that landmark genes performed slightly better in clustering diseased tissue types than any random set of 978 non-landmark genes, and the difference was statistically significant. Furthermore, both clinical and genetic variables were important in predicting the diseased tissue types. The top three clinical predictors of diseased tissue types were morphology, gender, and age of diagnosis. Additionally, this study explored the possibility of using the latent representations of the clusters of landmark and non-landmark genes as predictors for an MLR classifier. The classification models built using MLR revealed that landmark genes can serve as a subset of genetic variables and/or as a proxy for clinical variables. This study concludes that combining predictive analytics with dimensionality reduction effectively identifies key predictors in precision medicine, enhancing diagnostic accuracy.

Keywords: clustering; k-means; L1000 dataset analysis; landmark genes; multinomial logistic regression; non-landmark genes; principal component analysis; tissue classification

Classification codes: R318 [Medicine and Health: Biomedical Engineering]; TP311.13 [Medicine and Health: Basic Medicine]
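
Illustrative sketch: the abstract names k-means as the technique used to cluster diseased tissue types. The snippet below is a minimal, standard-library-only illustration of that clustering step and is not the authors' pipeline. Everything in it is an assumption made for illustration: the data are synthetic (not L1000), the three "tissue types" and five features are hypothetical, the deterministic farthest-point initialization is swapped in for reproducibility, and cluster purity is used as a simple quality score because the abstract does not name the metric the study used.

```python
import math
import random


def farthest_point_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from every centroid chosen so far. This deterministic start is
    an assumption for reproducibility, not something stated in the paper."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def purity(labels, truth):
    """Fraction of samples that belong to the majority class of their cluster."""
    total = 0
    for cluster in set(labels):
        classes = [t for lab, t in zip(labels, truth) if lab == cluster]
        total += max(classes.count(c) for c in set(classes))
    return total / len(labels)


# Synthetic stand-in for expression profiles: 3 hypothetical tissue types,
# 20 samples each, 5 features per sample, well separated by construction.
rng = random.Random(0)
n_types, n_per_type, n_features = 3, 20, 5
points, truth = [], []
for t in range(n_types):
    for _ in range(n_per_type):
        points.append(tuple(10.0 * t + rng.gauss(0.0, 0.5) for _ in range(n_features)))
        truth.append(t)

labels, centroids = kmeans(points, k=n_types)
print("purity:", purity(labels, truth))
```

To mimic the comparison described in the abstract, one could run the same function on two feature subsets (for example, one column set standing in for landmark genes and another for a random non-landmark set) and compare the resulting scores; that comparison, and any significance test on it, is left out of this sketch.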

 
