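Authors: 李谊澄, 侯锐志, 邹宗毓, 周子君[1]. Li Yicheng (Department of Health Policy and Management, School of Public Health, Peking University, Beijing 100191)
Affiliations: [1] Department of Health Policy and Management, School of Public Health, Peking University, Beijing 100191; [2] Department of Fundamental Mathematics, School of Mathematical Sciences, East China Normal University, Shanghai 200062
Source: Medicine and Society (《医学与社会》), 2020, No. 8, pp. 78-83 (6 pages)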
Abstract: Objective: To use the International Classification of Diseases (ICD) as the standard template of disease terms and, by extending the ICD-10 classification content and applying machine learning, to identify diverse and non-standard disease diagnosis names and convert them into unified standard diagnostic terms, thereby improving the efficiency of computer identification and classification of outpatient diagnosis names. Methods: Standard disease diagnosis terms served as the training set, and a model based on "rules + Bayesian machine learning + penalty scoring mechanism" was trained on them. The test data were outpatient diagnosis records from 22 grade-A tertiary hospitals in Beijing for the first half of 2018. Preprocessing the 783,364 non-standardized diagnosis names yielded 220,258 disease diagnoses, from which five samples of 1,000 diagnosis terms each were drawn by random sampling as test sets. Finally, the standardized results for the training set were screened one by one, and the accuracy rate, recall rate and comprehensive evaluation index were calculated for each. Results: With the "rules + Bayesian machine learning + penalty scoring mechanism" model, the average accuracy rate, recall rate and comprehensive evaluation index for identifying disease diagnosis names were 95.00%, 92.65% and 93.79%, respectively. The model effectively handled the standardization of diagnosis names that are similar in written form, highly varied, or difficult to classify. Conclusion: For artificial-intelligence identification of non-standard outpatient diagnosis names, the "rules + Bayesian machine learning + penalty scoring mechanism" model achieves good recognition performance: it matches non-standardized diagnosis names to standard ICD-10 diagnostic terms and maps them to the corresponding ICD-10 codes.
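The three evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes, since the abstract does not state this, that the "accuracy rate" plays the role of precision and that the "comprehensive evaluation index" is the balanced F-measure (the harmonic mean of precision and recall). The function names and the count-based inputs are hypothetical and are not taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumption: this is what the abstract calls the
    "comprehensive evaluation index"; the paper may define it differently.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Precision and recall from hypothetical per-sample match counts."""
    predicted = true_pos + false_pos   # names the model mapped to some ICD-10 term
    relevant = true_pos + false_neg    # names that have a correct ICD-10 term
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / relevant if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Plugging in the averages reported in the abstract.
    print(round(f_measure(0.9500, 0.9265), 4))
```

Applied to the reported averages, this formula gives about 93.81%, close to but not identical with the reported 93.79%. One possible explanation, which the abstract does not confirm, is that the index was computed separately for each of the five samples and then averaged.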
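Keywords: disease diagnosis names; standardization; machine learning; Bayesian model; penalty mechanism; Beijing
Classification number: R197.1 [Medicine and Health: Health Services Administration]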