运用决策树建立中国西南地区女性乳腺癌非遗传因素风险等级模型 (Cited by: 6)

Applying decision trees to establish risk rating model of breast cancer incidence based on non-genetic factors among Southwest China females


Authors: Li Qin (李芹) [1,4]; Diao Sha (刁莎); Li Hui (李卉) [2]; He Hua (何华); Li Jiayuan (李佳圆) [1] (West China School of Public Health, Sichuan University, Chengdu 610041, China; Department of Epidemiology and Health Statistics, Southwest Medical University, Luzhou 646000, China; Medical Department, Sichuan Maternal and Child Health Care Hospital, Sichuan Women and Children's Hospital, Chengdu 610045, China)

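Affiliations: [1] West China School of Public Health, Sichuan University, Chengdu 610041; [2] Department of Epidemiology and Health Statistics, Southwest Medical University, Luzhou 646000; [3] Medical Department, Sichuan Maternal and Child Health Care Hospital (Sichuan Women and Children's Hospital), Chengdu 610045; [4] Department of Hospital Infection Management, Sichuan Maternal and Child Health Care Hospital (Sichuan Women and Children's Hospital), Chengdu 610045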

Source: Chinese Journal of Oncology (《中华肿瘤杂志》), 2018, No. 11, pp. 872-877 (6 pages)

Funding: National Natural Science Foundation of China (81302500); Chengdu Science and Technology Bureau project (2015+HM01+00049-SF)

Abstract:

Objective: To use decision trees to estimate the risk of breast cancer under different combinations of non-genetic factors, and to build a non-genetic-factor risk rating model of breast cancer for women in Southwest China.

Methods: From 2014 to 2015, 783 incident cases of pathologically diagnosed primary breast cancer were sequentially collected from the breast surgery departments of West China Hospital of Sichuan University, Sichuan Cancer Hospital and Sichuan Province People's Hospital. Controls were randomly selected and matched 1:5 by area of residence (urban or rural) and age (±1 year); after excluding 36 controls with missing data, 3,879 controls remained. The classification and regression tree (CART) algorithm was used to build the risk rating model from non-genetic factors. Five test sets were randomly drawn to validate model performance.

Results: The risk rating model was built successfully. Ultrasound Breast Imaging Reporting and Data System (BI-RADS) category, menopausal status, age, history of benign breast disease, age at menarche, age at first delivery and number of live births were identified as risk factors and included in the model. Of these, BI-RADS category, menopausal status and age were the three most important. Across the five test sets, the mean sensitivity, positive predictive value and accuracy of the decision tree were 95.60%, 92.26% and 97.93%, respectively.

Conclusions: The breast cancer risk rating model built with decision trees is valid and reliable. It can estimate the relative risk probability of breast cancer under different combinations of non-genetic factors and can serve as a basic tool for stratifying breast cancer risk among women in Southwest China.
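The two technical ingredients named in the abstract, CART splitting and evaluation by sensitivity, positive predictive value and accuracy, can be illustrated with a minimal Python sketch. It is not the authors' code: the data are synthetic toy values, the feature name `birads` and all function names are hypothetical, and Gini impurity is assumed as the split criterion (the usual CART choice for classification, which the abstract does not state).

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 = control, 1 = case)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(rows, labels, feature):
    """Find the 'value <= threshold' split on one feature with the lowest
    weighted Gini impurity. Returns (threshold, weighted_gini); the
    threshold is None if no split improves on the unsplit node."""
    best = (None, gini(labels))
    candidates = sorted({row[feature] for row in rows})[:-1]
    for threshold in candidates:
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


def classification_metrics(y_true, y_pred):
    """Sensitivity, positive predictive value and accuracy for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        "accuracy": (tp + tn) / len(y_true) if y_true else float("nan"),
    }


# Synthetic toy data (NOT from the study): one hypothetical feature.
toy_rows = [{"birads": 2}, {"birads": 3}, {"birads": 4}, {"birads": 5}]
toy_labels = [0, 0, 1, 1]
split = best_binary_split(toy_rows, toy_labels, "birads")

# Synthetic predictions for one toy test set.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(toy_true, toy_pred)

if __name__ == "__main__":
    print("best split:", split)
    print("metrics:", metrics)
```

To mirror the validation described in the abstract, `classification_metrics` would be computed once per test set and the three values averaged over the five sets. With far more controls than cases, accuracy can exceed both sensitivity and positive predictive value, as it does in the reported figures.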

Keywords: breast neoplasms; decision trees; models, statistical

Classification number: R737.9 [Medicine and Health: Oncology]

 
