Authors: ZHANG Minghuan (章鸣嬛)[1]; CHEN Ying (陈瑛); GUO Xin (郭欣)[1]; ZHANG Xuan (张璇); JI Meng (季萌) (Lab of Big Data Analyses and Process, Information Science and Technology, Shanghai Sanda University, Shanghai 201209)
Affiliation: [1] Research Center for Big Data Analysis and Processing, Shanghai Sanda University, Shanghai 201209
Source: Computer & Digital Engineering (《计算机与数字工程》), 2020, No. 3, pp. 617-622 (6 pages)
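Funding: Supported by the 2016 Shanghai Key Research Project for Private Colleges and Universities (No. 2016-SHNGE-01ZD) and the 2015 IBM University Relations Joint Research Project (No. D-2111-15-001).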
Abstract: This study uses breast cancer data from 1990 to 2014 in the SEER database and builds models with two machine learning algorithms, Logistic regression and a neural network, to identify the factors that affect the 5-year prognosis of breast cancer. The results show that: 1) tumor stage, tumor grade, tumor size, estrogen level, age group, and progesterone level have a large influence on breast tumor prognosis, which is consistent with clinical diagnostic experience. 2) Under both models, sensitivity and specificity on the test set fall between 75.4% and 78.2%, and the area under the ROC curve (AUC) falls between 0.847 and 0.850. Logistic regression and neural network algorithms can therefore effectively explore the relationships among model input variables, build an optimized prognostic model for breast cancer patients, and help doctors assess patient prognosis and treatment effect.
Keywords: SEER; breast cancer; Logistic regression; neural network; prognostic factors
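
The abstract reports three evaluation metrics: sensitivity, specificity, and AUC. As a rough illustration of how such figures are typically computed for a binary prognosis classifier, here is a minimal sketch in plain Python. The labels and scores are made-up toy values, not SEER data, and the function names `sensitivity_specificity` and `roc_auc` are hypothetical helpers rather than anything taken from the paper.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumption): 1 = event of interest, 0 = no event.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]  # model probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # 0.5 cutoff is assumed

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(sens, spec, auc)  # 0.75 0.75 0.8125
```

Sensitivity and specificity depend on the chosen cutoff (0.5 here is only an assumption), whereas AUC summarizes ranking quality across all cutoffs, which is why studies of this kind usually report both.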