Recurrence Prediction Model of DLBCL Patients within 2 Years based on SMOTE-ENN Combined with Improved Dynamic Ensemble Selection Algorithm

Original title: 基于SMOTE-ENN结合改进动态集成选择算法构建DLBCL患者2年内复发预测模型


Authors: Zhang Gaoyuan, Zhao Ruiqing, Zhang Yanbo [1,2,3], Yu Hongmei, Zhou Jie [4], Qiao Yu, Wang Junxia [1], Wang Xueman, Yu Kai, Guo Yujiao, Zhao Zhiqiang, Luo Yanhong (Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan 030001)

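Affiliations: [1] Department of Health Statistics, School of Public Health, Shanxi Medical University, 030001; [2] Shanxi Provincial Key Laboratory of Major Disease Risk Assessment; [3] Key Laboratory of Coal Environmental Pathogenicity and Prevention, Ministry of Education; [4] PET/CT Center, Department of Nuclear Medicine, Shanxi Cancer Hospital; [5] Department of Hematology, Shanxi Cancer Hospital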

Source: Chinese Journal of Health Statistics (《中国卫生统计》), 2025, Issue 1, pp. 50-55, 61 (7 pages)

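Funding: General Program of the Applied Basic Research Plan, Shanxi Provincial Department of Science and Technology (202103021224245); 2024 Shanxi Province Higher Education Teaching Reform and Innovation Project (J20240531); Shanxi Province 2024 Graduate Education Innovation Program (2024JG088); National Natural Science Foundation of China, Young Scientists Fund (81502897, 82273742, 82173631); Shanxi Medical University Doctoral Start-up Fund (BS2017029).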

Abstract: Objective: To construct a model, based on frienemy indecision region dynamic ensemble selection (FIRE-DES), that predicts recurrence within two years after complete remission in patients with diffuse large B-cell lymphoma (DLBCL), and so to provide a basis for treatment decisions. Methods: Data were collected on 498 patients who achieved complete remission after treatment at a Grade-A tertiary hospital in Shanxi Province between January 2010 and January 2020. FIRE-DES recurrence prediction models were built on four common class-imbalance handling methods and compared with five traditional single classifiers and two ensemble classifiers. Results: Among the four class-imbalance methods, the synthetic minority oversampling technique combined with edited nearest neighbor (SMOTE-ENN) gave the best classification performance. On the SMOTE-ENN-resampled data, the dynamic ensemble selection algorithms DESP (dynamic ensemble selection performance), KNORAU (K-nearest oracle union), and META-DES (meta-learning for dynamic ensemble selection) clearly outperformed the traditional single classifiers and ensemble classifiers. The FIRE-improved versions of DESP, KNORAU, and META-DES improved on these results further, and FIRE-META-DES performed best (accuracy = 0.909, precision = 0.906, recall = 0.967, AUC = 0.879, F1-score = 0.936, Brier score = 0.088). Conclusion: On this real-world DLBCL data set, the combined SMOTE-ENN + FIRE-META-DES recurrence prediction model achieved the best performance with low computational complexity and can serve as a strong reference for DLBCL recurrence prediction.
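The resampling step named in the abstract, SMOTE-ENN, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation, which the abstract does not describe. It assumes binary labels, oversampling to a 1:1 class ratio, Euclidean distance, and a majority-vote editing rule with k = 3. The function names (`nearest`, `smote`, `enn`) and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def nearest(points, i, k, candidates):
    """Return the k indices from `candidates` closest to points[i], excluding i."""
    others = [j for j in candidates if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def smote(X, y, minority, k=2, seed=0):
    """Oversample `minority` up to the majority count by interpolating
    between a minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_new = (len(y) - len(min_idx)) - len(min_idx)  # binary labels assumed
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_new, 0)):
        i = rng.choice(min_idx)
        j = rng.choice(nearest(X, i, k, min_idx))
        gap = rng.random()  # position along the segment from X[i] to X[j]
        X_out.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y_out.append(minority)
    return X_out, y_out


def enn(X, y, k=3):
    """Drop every sample whose label disagrees with the majority label
    of its k nearest neighbours (assumed editing rule for this sketch)."""
    idx = range(len(X))
    keep = [
        i for i in idx
        if Counter(y[j] for j in nearest(X, i, k, idx)).most_common(1)[0][0] == y[i]
    ]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: 6 majority points (label 0) and 3 minority points (label 1).
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2), (5, 5), (6, 5), (5, 6)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_bal, y_bal = smote(X, y, minority=1)  # step 1: oversample the minority class
X_res, y_res = enn(X_bal, y_bal)        # step 2: remove samples that look mislabeled
print(Counter(y_bal), Counter(y_res))
```

In a pipeline of the kind the abstract outlines, resampling like this would be applied to the training data only, and the dynamic ensemble selection classifiers would then be fitted on the resampled set.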

Keywords: diffuse large B-cell lymphoma; recurrence prediction; class imbalance; dynamic ensemble selection

CLC number: R195.1 [Medicine and Health: Health Statistics]

 
