Authors: SHAO Yun-xia (邵云霞) [1,2]; WANG Cheng (王程); CHENG Bin (成彬) [1]; HAN Zhen-zhen (韩珍珍) [1]; HAN Yue (韩月)
Affiliations: [1] Institute of Applied Mathematics, Hebei Academy of Sciences, Shijiazhuang 050081, Hebei, China; [2] Hebei Authentication Technology Engineering Research Center, Shijiazhuang 050081, Hebei, China; [3] Hebei Normal University, Shijiazhuang 050000, Hebei, China
Source: Computer Technology and Development (《计算机技术与发展》), 2021, No. 4, pp. 46-51 (6 pages)
Funding: Hebei Province Science and Technology Plan Project (19603).
Abstract: As China's reform of disease-based medical insurance payment advances steadily, accurate and standardized disease classification has become an urgent problem for the medical insurance system and a key step in the smooth progress of the new medical reform. The biggest current difficulty is that hospitals' disease names and disease codes are not standardized, and the mapping between them is inconsistent. This paper therefore proposes a disease category prediction model that combines several algorithms. First, the front-page data of inpatient medical records are preprocessed through quality checks and cleaning. Data sets are then generated by over-sampling and by increasing the weight of sensitive data, in order to address disease class imbalance and cost sensitivity. Natural language processing is used to segment the Chinese text in the data set into words and map them to a vector space, and text similarity is computed to screen candidate disease groups. SVM and Text CNN are combined to form the disease prediction model, which is tested on data sets of different sample sizes, and the results are analyzed. The experiments use front-page medical record data from more than 300,000 appendicitis patients collected from 2012 to 2018. The results show that SVM suits models for rare diseases with small sample sizes, where it is effective and stable, while Text CNN suits models for common diseases with large sample sizes, where it achieves high accuracy. Finally, open problems and directions for development in this field are discussed.
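As an illustration only, the sketch below shows two of the preprocessing ideas named in the abstract: random over-sampling to balance classes, and cosine text similarity to screen candidate disease groups. It is not the authors' implementation. Character bigrams stand in for real Chinese word segmentation, the function names and sample strings are hypothetical, and the SVM and Text CNN classifiers are not shown.

```python
import math
import random
from collections import Counter


def char_bigrams(text):
    """Crude stand-in for Chinese word segmentation: overlapping character bigrams."""
    text = text.strip()
    if len(text) < 2:
        return Counter([text]) if text else Counter()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def screen_groups(diagnosis, group_names, top_k=2):
    """Return the top_k group names most similar to a free-text diagnosis (hypothetical helper)."""
    query = char_bigrams(diagnosis)
    scored = [(cosine_similarity(query, char_bigrams(name)), name) for name in group_names]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class matches the largest one."""
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_label.values())
    out_samples, out_labels = [], []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels
```

For example, `screen_groups("急性阑尾炎", ["急性阑尾炎", "慢性胃炎", "阑尾周围脓肿"], top_k=1)` selects the exact-match group name, and `random_oversample` applied to three samples of one class and one of another returns three samples of each. A real pipeline would likely use a proper segmenter and learned word vectors instead of bigram counts.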
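Keywords: disease-based payment; natural language processing; machine learning; deep learning; disease classification
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]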