Application of decision tree techniques in research on anemia among rural children under 3 years old  Cited by: 5


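Authors: Ma Yugang (马玉刚)[1], Bi Yuxue (毕育学)[1], Yan Hong (颜虹)[1], Deng Lina (邓立娜)[1], Liang Weifeng (梁卫峰)[1], Wang Bei (王蓓)[1], Zhang Xueli (张雪丽)[1]

Affiliation: [1] Teaching and Research Section of Health Statistics, Department of Public Health, School of Medicine, Xi'an Jiaotong University, 710061

Source: Chinese Journal of Preventive Medicine (《中华预防医学杂志》), 2009, Issue 5, pp. 434-437 (4 pages)

Funding: Project funded by the Ministry of Health and UNICEF (YH001); National Natural Science Foundation of China (30771866)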

Abstract: Objective To explore the application of decision tree techniques in research on anemia among rural children. Methods In the Enterprise Miner module of SAS 8.2, health care survey data on 3000 weaned children under 3 years old from rural areas were split 75%/25% into a training set for initial model fitting and a validation set for model adjustment. A CART decision tree model was built using the Gini impurity function and evaluated by misclassification rate, ROC curve, Root ASE, and a diagnostic chart. The variables retained in the model, together with their hierarchical positions in the tree, were used to analyze the factors influencing anemia among these children and the interactions among those factors. Results The misclassification rates of the CART model were 21.2% in the training set and 21.9% in the validation set, and the Root ASE values were 0.399 and 0.404, respectively. The ROC curve lay above the reference line, with a relatively large area under the curve. In the diagnostic chart, the proportion of observations whose actual and predicted values agreed was the largest, and the agreement rate for correctly classified observations was clearly higher than that for misclassified observations. The model selected 9 important factors influencing childhood anemia and ranked them by relative importance: maternal anemia (1.00) was the most important, followed by the child's age in months (0.75), time of weaning (0.53), mother's age (0.32), time of introducing eggs (0.26), project county category (0.26), time of introducing fresh milk (0.16), family size (0.13), and mother's years of education (0.12). The decision tree produced simple rules that might be used for classification and prediction in similar research. Conclusion Decision trees can screen out the important factors influencing anemia and identify cut points for those factors, offering a new approach to analyzing child health care data. With wider use, the technique could prove valuable in research on rural child health care.
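The workflow summarized in the abstract (a 75%/25% split, Gini-based splitting, then misclassification rate and Root ASE on each partition) can be illustrated with a minimal Python sketch. It is not the authors' SAS Enterprise Miner procedure: the data are synthetic, the variable names `mother_anemic` and `noise` and the anemia probabilities are hypothetical, and the "tree" is a single split rather than a full CART model. Root ASE is taken here to be the square root of the average squared difference between the observed 0/1 outcome and the predicted probability.

```python
import math
import random


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 for a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def split_gain(rows, feature, target="anemia"):
    """Decrease in Gini impurity from splitting rows on a binary feature."""
    parent = gini_impurity([r[target] for r in rows])
    left = [r[target] for r in rows if r[feature] == 0]
    right = [r[target] for r in rows if r[feature] == 1]
    n = len(rows)
    children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return parent - children


def fit_stump(train, feature, target="anemia"):
    """One-split tree: each leaf stores the training proportion of cases."""
    leaves = {}
    for value in (0, 1):
        ys = [r[target] for r in train if r[feature] == value]
        leaves[value] = sum(ys) / len(ys)
    return leaves


def misclassification_rate(y_true, p_hat, cutoff=0.5):
    """Share of observations whose predicted class differs from the outcome."""
    wrong = sum((p >= cutoff) != bool(t) for t, p in zip(y_true, p_hat))
    return wrong / len(y_true)


def root_ase(y_true, p_hat):
    """Square root of the average squared error of predicted probabilities."""
    sse = sum((t - p) ** 2 for t, p in zip(y_true, p_hat))
    return math.sqrt(sse / len(y_true))


# Synthetic data only: the probabilities below are invented for illustration.
random.seed(0)
rows = []
for _ in range(3000):
    mother = 1 if random.random() < 0.3 else 0
    noise = 1 if random.random() < 0.5 else 0
    anemia = 1 if random.random() < (0.6 if mother else 0.2) else 0
    rows.append({"mother_anemic": mother, "noise": noise, "anemia": anemia})

# 75% training set, 25% validation set, as described in the abstract.
random.shuffle(rows)
cut = int(0.75 * len(rows))
train, valid = rows[:cut], rows[cut:]

# Choose the feature with the largest impurity decrease on the training set.
gains = {f: split_gain(train, f) for f in ("mother_anemic", "noise")}
best = max(gains, key=gains.get)
leaves = fit_stump(train, best)

# Evaluate the same fitted stump on both partitions.
for name, part in (("training", train), ("validation", valid)):
    y = [r["anemia"] for r in part]
    p = [leaves[r[best]] for r in part]
    print(name, round(misclassification_rate(y, p), 3), round(root_ase(y, p), 3))
```

Comparing the two printed rows mirrors the check reported in the abstract: training and validation values that stay close to each other suggest the fitted tree is not overfitting the training set.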

Keywords: decision tree; anemia; children; misclassification rate

Classification code: R686 [Medicine and Health: Orthopedics]

 
