Decision Tree Model Based on Linear Regression for Heart Disease Diagnosis


Authors: 闵杰青[1] 李昕洁 谭强 赵娜[3] 李向娟 王剑[4] 曾敬勋 刘学承

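Affiliations: [1] Kunming Children's Hospital, Kunming, Yunnan; [2] Institute of Management of Technology, National Chiao Tung University, Hsinchu, Taiwan; [3] Key Laboratory of Engineering, School of Software, Yunnan University, Kunming, Yunnan; [4] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan; [5] Department of Computer Science, The University of Manchester, Manchester, UK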

Source: Computer Science and Application (《计算机科学与应用》), 2021, No. 8, pp. 2108-2116 (9 pages)

Abstract: Heart disease is a common, high-incidence condition and one of the leading causes of death. Improving the accuracy of its diagnosis, so that intervention and treatment can begin earlier, is therefore an important problem. In this paper, data preprocessing and the early stage of model building were implemented in Python, and this analysis showed that the proportion of patients with the disease is related to sex and age. The data were then analyzed in SPSS: the R value was 0.719, which falls in the 0.5~1 range regarded as a large effect, so the model fits well. The significance value of the analysis of variance was reported as 0, within the 0~0.05 range, indicating that the linear regression model built from the parameters is highly statistically significant, that is, the linear relationship is significant. In the later stage of model building, several prediction models represented by decision trees were used. The final prediction accuracies were 85.6% for the decision tree based on information entropy, 84.2% for the decision tree based on the Gini index, and 86.6% for the pre-pruned decision tree based on the Gini index. All models reached an accuracy of about 85%, and the pre-pruned Gini-index decision tree was the most accurate.
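The abstract names two split criteria (information entropy and the Gini index) and a pre-pruning step, but this record does not show how they work. The following is a minimal sketch of those ideas for a single numeric feature, not the authors' implementation. The function names `entropy`, `gini`, and `best_split`, the `min_samples_leaf` pre-pruning rule, and the toy age/label data are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(values, labels, impurity, min_samples_leaf=1):
    """Return (threshold, impurity_decrease) for the best split `value <= threshold`.

    `min_samples_leaf` is a simple pre-pruning rule (an assumption here):
    candidate splits that leave fewer samples than this on either side
    are skipped. Returns (None, 0.0) if no admissible split helps.
    """
    n = len(labels)
    parent = impurity(labels)
    best = (None, 0.0)
    # Every distinct value except the largest is a candidate threshold,
    # so both children are always non-empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        if len(left) < min_samples_leaf or len(right) < min_samples_leaf:
            continue
        # Weighted average impurity of the two children.
        child = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        gain = parent - child
        if gain > best[1]:
            best = (t, gain)
    return best


# Made-up toy data, for illustration only: ages and a binary label (1 = disease).
ages = [35, 42, 48, 51, 55, 58, 61, 67]
sick = [0, 0, 0, 1, 0, 1, 1, 1]

print("entropy criterion:", best_split(ages, sick, entropy))
print("gini criterion:   ", best_split(ages, sick, gini))
print("gini, pre-pruned: ", best_split(ages, sick, gini, min_samples_leaf=4))
```

A full decision tree applies such a split search recursively over all features. Pre-pruning stops that recursion early using rules like the minimum leaf size above, which can trade a little training fit for better generalization.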

Keywords: analysis of variance; linear regression; decision tree; smart healthcare

Classification code: R28 [Medicine and Health: Chinese Materia Medica]

 
