Authors: 黄振华 (Zhenhua HUANG); 王振宇 (Zhenyu WANG) [1]; 江莉 (Li JIANG); 张睿 (Rui ZHANG); 雷昶 (Chang LEI); 刘星炜 (Xingwei LIU); 谢晓辉 (Xiaohui XIE) (School of Software Engineering, South China University of Technology, Guangzhou 510006, China; Department of Information and Computer Science, University of California Irvine, Irvine 92617, USA; Department of Pharmacy, The First Affiliated Hospital of Anhui Medical University, Hefei 230032, China)
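Affiliations: [1] School of Software Engineering, South China University of Technology, Guangzhou 510006, China [2] Department of Information and Computer Science, University of California Irvine, Irvine 92617, USA [3] Department of Pharmacy, The First Affiliated Hospital of Anhui Medical University, Hefei 230032, China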
Source: Scientia Sinica (Informationis) (《中国科学:信息科学》), 2020, No. 12, pp. 1882-1902 (21 pages)
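Funding: Supported by the General Program of the National Natural Science Foundation of China (Grant No. 61876207); the Guangdong Province Key-Area Research and Development Program (Grant No. 2019B010154004); the Guangdong Basic and Applied Basic Research Foundation (Grant No. 2019A1515011792); the Guangzhou Major Industrial Technology Research Program (Grant No. 201802010025); and the Key Project of the Guangzhou University Innovation and Entrepreneurship Platform Construction Program (Grant No. 2019PT103).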
Abstract: In early 2020, pneumonia caused by the novel coronavirus (COVID-19) broke out, and China took comprehensive and rigorous control measures to fight the epidemic. Local epidemic command departments reported infection data in a timely manner, which helped the public understand the development of the epidemic and take protective measures promptly. However, no literature has yet analyzed the transmission characteristics of COVID-19 based on structured data from large-scale patient cases and artificial intelligence. Detailed patient case data in each region are recorded primarily as text, the descriptions are complex, and the reporting formats differ across provinces and cities, which makes such data difficult to process. Working with anonymized detailed case data covering nearly half of the patients nationwide outside Hubei Province, we propose a method that applies natural language processing to help structure the case data. With the help of a pretrained model, the method extracts key information from case text accurately and effectively even when only a small number of labeled samples is available. By mining this relatively large-scale structured case data, we analyze in detail the overall gender and age distribution of cases, the main causes of infection, the characteristics of the incubation period, and epidemic trends. Because of time delays such as the incubation period, the number of confirmed cases often does not reflect the true infection situation in a region. Using big data on travel, we propose a method to infer the actual number of infected individuals in cities such as Wuhan before the restrictions were put into effect. This method helps people estimate how the epidemic will develop in a region and take protective measures early. It also helps local government departments make evidence-based decisions, dispatch medical staff, and allocate medical resources as quickly as possible.
Keywords: novel coronavirus; structured cases; natural language processing; pretrained model; COVID-19 transmission characteristics; travel big data
Classification: R181.3 [Medicine and Health: Epidemiology]; R563.1 [Medicine and Health: Public Health and Preventive Medicine]