Research on Diabetes Classification Prediction Based on Tree Models

Cited by: 1

Author: Wang Yixuan (汪艺璇), School of Economics, Hebei University of Geosciences, Shijiazhuang 050030, China

Affiliation: [1] School of Economics, Hebei University of Geosciences, Shijiazhuang 050030, China

Source: Heilongjiang Science (《黑龙江科学》), 2023, No. 16, pp. 57-61 (5 pages)

Abstract: Diabetes is a chronic disease with a relatively high incidence, and classification prediction can support its prevention and diagnosis. Taking diabetes classification as the research object, the study uses eight feature variables (pregnancies, glucose, blood pressure, skin thickness, insulin, BMI, diabetes pedigree function, and age) as explanatory variables and compares the prediction accuracy of five tree models on a diabetes dataset: the C4.5 decision tree, the CART decision tree, Bagging, random forest, and AdaBoost. The prediction error rates of the five models are 20.96%, 17.19%, 1.17%, 0%, and 0%, respectively, which empirically supports the advantage of combined (ensemble) decision tree models in diabetes classification prediction. The AdaBoost model is then selected for the classification prediction of diabetes, and the relative importance of the eight feature variables is examined. Glucose, BMI, and diabetes pedigree function are found to have relatively high importance, so these three variables deserve more attention in the prevention and diagnosis of diabetes.
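The comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the author's code: it assumes scikit-learn, uses a synthetic 8-feature dataset from `make_classification` as a stand-in for the diabetes data, and approximates C4.5 with an entropy-criterion tree because scikit-learn does not implement C4.5 itself. The abstract does not state whether the error rates were measured on training or held-out data, so the sketch reports both. The sample size, split ratio, and `n_estimators` values are arbitrary assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Feature names taken from the abstract; they only label the synthetic columns.
FEATURES = ["pregnancies", "glucose", "blood pressure", "skin thickness",
            "insulin", "BMI", "diabetes pedigree function", "age"]

# Synthetic stand-in data (assumption): NOT the dataset used in the study.
X, y = make_classification(n_samples=500, n_features=8, n_informative=4,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

models = {
    # Entropy-based tree as a rough proxy for C4.5 (assumption).
    "Entropy tree (C4.5-like)": DecisionTreeClassifier(criterion="entropy",
                                                       random_state=0),
    "CART (Gini)": DecisionTreeClassifier(criterion="gini", random_state=0),
    # BaggingClassifier uses decision trees as its default base learner.
    "Bagging": BaggingClassifier(n_estimators=100, random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=0),
}

# Error rate = 1 - accuracy, computed on both the training and the test split.
errors = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_err = 1.0 - model.score(X_train, y_train)
    test_err = 1.0 - model.score(X_test, y_test)
    errors[name] = (train_err, test_err)
    print(f"{name:26s} train error {train_err:6.2%}  test error {test_err:6.2%}")

# Relative feature importance from the fitted AdaBoost model, largest first.
importances = dict(zip(FEATURES, models["AdaBoost"].feature_importances_))
for feature, value in sorted(importances.items(), key=lambda kv: kv[1],
                             reverse=True):
    print(f"{feature:28s} {value:.3f}")
```

On real data, the synthetic arrays would be replaced by the eight measured variables and the diabetes label. Comparing the two error columns shows how much of an ensemble's apparent accuracy survives on data it was not fitted to.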

Keywords: diabetes classification; C4.5; CART; Bagging; random forest; AdaBoost

Classification code: C8 [Sociology - Statistics]

 
