Author: Wang Yixuan (汪艺璇), School of Economics, Hebei University of Geosciences, Shijiazhuang 050030, China
Source: Heilongjiang Science (《黑龙江科学》), 2023, No. 16, pp. 57-61 (5 pages)
Abstract: Diabetes is a chronic disease with a relatively high incidence, and classification-based prediction can support its prevention and diagnosis. Taking diabetes classification as the research object, this study uses eight features as explanatory variables (pregnancies, glucose, blood pressure, skin thickness, insulin, BMI, diabetes pedigree function, and age) and compares the prediction accuracy of five tree-based models on a diabetes dataset: the C4.5 decision tree, the CART decision tree, Bagging, random forest, and AdaBoost. Their prediction error rates are 20.96%, 17.19%, 1.17%, 0%, and 0%, respectively, which the study takes as empirical evidence that ensembles of decision trees outperform single trees for diabetes classification. AdaBoost is then selected as the classification model and used to assess the relative importance of the eight features. Glucose, BMI, and diabetes pedigree function turn out to be the most important, so these three variables deserve particular attention in diabetes prevention and diagnosis.
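The abstract does not show the study's code, software, or data split, so the following is only a minimal pure-Python sketch of the general idea behind one of the compared ensembles: AdaBoost built from one-split decision stumps, set against a single stump. The toy one-feature dataset and all function names (`fit_stump`, `adaboost`, `error_rate`) are hypothetical and are not taken from the article. The error rate computed here is measured on the training data itself; the abstract does not state whether its reported error rates are training or held-out figures.

```python
import math

# Hypothetical toy data (NOT the diabetes dataset): one feature, labels in {+1, -1}.
X = [[float(i)] for i in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]


def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                # The stump predicts `pol` when the feature is <= t, else `-pol`.
                pred = [pol if row[j] <= t else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, pol)
    return best


def stump_predict(stump, row):
    _, j, t, pol = stump
    return pol if row[j] <= t else -pol


def adaboost(X, y, rounds):
    """Discrete AdaBoost: reweight samples so later stumps focus on mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)     # vote weight of this stump
        model.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1


def error_rate(model, X, y):
    """Fraction of samples the model misclassifies."""
    return sum(predict(model, row) != yi for row, yi in zip(X, y)) / len(y)


single = error_rate(adaboost(X, y, 1), X, y)   # one stump on its own
boosted = error_rate(adaboost(X, y, 3), X, y)  # weighted vote of three stumps
print(f"single stump: {single:.2f}, boosted: {boosted:.2f}")
```

On this toy data no single threshold separates the classes, while a weighted vote of three stumps does. This mirrors, in miniature, the kind of gap between single trees and tree ensembles that the abstract reports.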