Automatic drawing of global risk-alerting map of communicable diseases based on text-mining technology  Cited by: 3


Authors: QIU Jiong-liang (裘炯良); ZHENG Jian-ning (郑剑宁); WU Wei (吴薇)

Affiliations: [1] Beilun Customs, Ningbo 315800, Zhejiang, China [2] Ningbo Customs, Ningbo 315012, Zhejiang, China

Source: Chinese Journal of Hygienic Insecticides and Equipments (《中华卫生杀虫药械》), 2020, No. 3, pp. 270-273 (4 pages)

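Funding: Science and Technology Program of the former General Administration of Quality Supervision, Inspection and Quarantine of China (No. 2012B172).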

Abstract: Objective: To explore the use of text-mining technology for the automatic, computer-based drawing of a global risk-alerting map showing the geographic distribution of communicable diseases. Methods: Big data processing techniques, including web text crawling, data cleaning, key information mining, database merging, planar map projection, and marking of communicable disease information, were implemented in SAS 9.4 to assess periodically, and to visualize, the occurrence or epidemic spread of communicable diseases worldwide. Results: Using the most recent 6 months as the assessment period, about 3,000 records (roughly 95,000 characters) of global communicable disease information were crawled from the internet and assembled into an unstructured data warehouse. A risk assessment database covering 48 major international communicable diseases was built, together with a dictionary of countries or areas containing 122,000 entries. After text mining, countries or areas were displayed on the world map in red, orange, or blue to indicate high, medium, or low communicable disease risk, respectively. Hovering the mouse over any country or area, or touching it on a touchscreen, displayed an alert listing the communicable diseases currently occurring or circulating there. Conclusion: Big data technology based on text mining can efficiently process unstructured data, including text, so that alerts can be produced by fully automatic computation. This can effectively improve the efficiency of China's prevention and control of communicable diseases imported from abroad.
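The pipeline summarized in the abstract (clean crawled text, match it against country and disease dictionaries, then assign a colour-coded risk level) can be illustrated with a minimal Python sketch. This is not the authors' SAS 9.4 implementation. The tiny dictionaries, the sample records, and the rule that grades risk by the number of distinct diseases mentioned for a country are all assumptions made for illustration; only the red/orange/blue mapping to high/medium/low risk comes from the abstract.

```python
import re
from collections import defaultdict

# Hypothetical miniature dictionaries (alias -> canonical name). The paper's own
# dictionaries (48 diseases, about 122,000 country/area entries) are not reproduced.
COUNTRY_DICT = {
    "brazil": "Brazil",
    "brasil": "Brazil",
    "dr congo": "Democratic Republic of the Congo",
    "india": "India",
}
DISEASE_DICT = {
    "dengue": "Dengue fever",
    "ebola": "Ebola virus disease",
    "cholera": "Cholera",
    "yellow fever": "Yellow fever",
}

# Colour scheme stated in the abstract: red = high, orange = medium, blue = low.
RISK_COLOURS = {"high": "red", "medium": "orange", "low": "blue"}


def clean(text):
    """Strip HTML tags and collapse whitespace in one crawled record."""
    no_tags = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", no_tags).strip()


def find_terms(text, dictionary):
    """Return the canonical names whose aliases occur as whole words in text."""
    lowered = text.lower()
    return {
        canonical
        for alias, canonical in dictionary.items()
        if re.search(r"\b" + re.escape(alias) + r"\b", lowered)
    }


def risk_level(disease_count, high=3, medium=2):
    """Assumed grading rule: thresholds on the number of distinct diseases."""
    if disease_count >= high:
        return "high"
    if disease_count >= medium:
        return "medium"
    return "low"


def build_map_data(records):
    """Map each country to its diseases, risk level, and display colour."""
    diseases_by_country = defaultdict(set)
    for raw in records:
        text = clean(raw)
        diseases = find_terms(text, DISEASE_DICT)
        if not diseases:
            continue  # a record without a known disease adds no information
        for country in find_terms(text, COUNTRY_DICT):
            diseases_by_country[country].update(diseases)
    result = {}
    for country, diseases in diseases_by_country.items():
        level = risk_level(len(diseases))
        result[country] = {
            "diseases": sorted(diseases),
            "level": level,
            "colour": RISK_COLOURS[level],
        }
    return result


# Invented sample records standing in for crawled web text.
records = [
    "<p>Brazil reports a rise in dengue cases.</p>",
    "Yellow fever vaccination campaign launched in Brasil.",
    "Cholera outbreak confirmed in   Brazil's northern states.",
    "Ebola case reported in DR Congo.",
    "India: cholera and dengue cases increase after monsoon.",
]

for country, info in sorted(build_map_data(records).items()):
    print(country, info["colour"], info["diseases"])
```

In a real system the per-country output would feed a map renderer, with the disease list shown as the hover or touch tooltip described in the abstract. Whole-word alias matching is only one simple way to do the dictionary lookup.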

Keywords: communicable diseases; risk alerting; statistical map; text mining; SAS

Classification number: R184.3 [Medicine and Health: Epidemiology]

 
