Stratified extraction of risk factors for the subtype of coronary heart disease based on electronic medical record

Cited by: 1

Authors: Yu Guanglei, Zhang Linlin [2], Zhang Ying [3], Li Xinyao, Bi Xuehua [1] (School of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830017, Xinjiang Uygur Autonomous Region, China; College of Information Science and Engineering, Xinjiang University; The First Affiliated Hospital of Xinjiang Medical University)

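Affiliations: [1] School of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830017 [2] College of Information Science and Engineering, Xinjiang University [3] The First Affiliated Hospital of Xinjiang Medical University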

Source: China Digital Medicine, 2022, No. 3, pp. 39-44 (6 pages)

Funding: Natural Science Foundation of Xinjiang Uygur Autonomous Region (2019D01C205, 2019D01C041).

Abstract: Objective: To analyze patient characteristics in electronic medical records and achieve stratified extraction of risk factors under each subtype of coronary heart disease (CHD). Methods: An improved Labeled LDA model is proposed. It generates CHD subtype category labels from a multinomial distribution and then generates the latent topic of risk factor stratification from the subtype category label, yielding a four-layer topic model: CHD subtype, patient, risk factor stratification, patient characteristics. By establishing a mapping between subtype category labels and risk factor strata, the model first performs multi-class prediction of the CHD subtype and then automatically extracts each patient's risk factors, stratified under the different subtypes. Results: Validated on electronic medical record data collected in a real clinical setting, the model reached an accuracy of 83.23% and a Macro-F1 of 82.31%. Conclusion: The experimental results indicate that, by constraining the mapping between patient subtype categories and the latent topics of risk factor stratification, the improved Labeled LDA model offers high interpretability, and its accuracy exceeds that of all four comparison models: logistic regression, support vector machine, random forest, and LightGBM.
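The four-layer generative story described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the subtype names, strata, features, and probabilities below are hypothetical placeholders, and the per-patient mixture over strata is simplified to a uniform choice rather than a learned distribution. The sketch only shows the mapping constraint, namely that a patient's features are drawn solely from the risk factor strata attached to that patient's subtype label.

```python
import random

# Hypothetical toy values; none of these names or numbers come from the paper.
SUBTYPES = ["subtype_A", "subtype_B"]
SUBTYPE_PRIOR = [0.6, 0.4]  # multinomial distribution over subtype labels

# Mapping constraint: each subtype label owns its own risk factor strata
# (the latent topics), as in Labeled LDA's label-to-topic restriction.
STRATA_BY_SUBTYPE = {
    "subtype_A": ["A_high_risk", "A_low_risk"],
    "subtype_B": ["B_high_risk", "B_low_risk"],
}

# Each stratum is a distribution over patient features.
FEATURES_BY_STRATUM = {
    "A_high_risk": {"hypertension": 0.7, "smoking": 0.3},
    "A_low_risk": {"age_over_60": 0.5, "obesity": 0.5},
    "B_high_risk": {"diabetes": 0.6, "high_ldl": 0.4},
    "B_low_risk": {"family_history": 0.8, "smoking": 0.2},
}


def generate_patient(n_features, rng):
    """Sample one synthetic patient: subtype -> stratum -> feature."""
    # Layer 1: draw the subtype label from the multinomial prior.
    subtype = rng.choices(SUBTYPES, weights=SUBTYPE_PRIOR, k=1)[0]
    # Layers 2-3: only strata mapped to this subtype are allowed.
    allowed = STRATA_BY_SUBTYPE[subtype]
    record = []
    for _ in range(n_features):
        # Simplification: uniform choice among the allowed strata.
        stratum = rng.choice(allowed)
        dist = FEATURES_BY_STRATUM[stratum]
        # Layer 4: draw a patient feature from the stratum's distribution.
        feature = rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
        record.append((stratum, feature))
    return subtype, record


if __name__ == "__main__":
    subtype, record = generate_patient(5, random.Random(0))
    print(subtype, record)
```

In a fitted model the distributions would be estimated from the medical records; the sketch only runs the assumed process forward to make the subtype-to-strata restriction concrete.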

Keywords: Labeled LDA; stratified extraction; risk factors; coronary heart disease

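Classification: TP391 [Automation and Computer Technology: Computer Application Technology]; R319 [Automation and Computer Technology: Computer Science and Technology]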

 
