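Authors: Bian Weiwei (卞伟玮), Wang Yongchao (王永超)[2,3], Cui Lizhen (崔立真)[2,4], Guo Wei (郭伟)[2,4], Li Hui (李晖)[2,4], Zhou Miao (周苗)[1,2], Xue Fuzhong (薛付忠)[1,2], Liu Jing (刘静)[1,2]
Affiliations: [1] Department of Biostatistics, School of Public Health, Shandong University, Jinan 250012, Shandong, China; [2] Qilu Biomedical Big Data Research Center, Shandong University (山东大学齐鲁生物医学大数据研究中心), Jinan 250012, Shandong, China; [3] 康评健康医疗大数据科技有限公司 (a health and medical big data technology company), Jinan 250101, Shandong, China; [4] School of Computer Science and Technology, Shandong University, Jinan 250101, Shandong, China
Source: Journal of Shandong University (Health Sciences) (《山东大学学报(医学版)》), 2017, No. 6, pp. 47-55 (9 pages)
Funding: National Natural Science Foundation of China (81273177)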
Abstract: Objective: To obtain medical data from public health service systems quickly and accurately, and to organize those data as the basis for building population health risk assessment models. Methods: An algorithm was designed and implemented using focused web crawler technology, with improvements in three areas: automatic recording and correction of URL anomalies, archiving of the original data, and maintaining the login session. The resulting crawler was used to collect medical data from websites for which authorization had been obtained, and the data were parsed, organized, and exported through a medical database system. Results: Data were obtained from multiple public health service bases, data analysis reports were provided to local government departments, and several health risk assessment models were built from the processed data. Conclusion: A data collection and processing system based on web crawler technology can address the difficulty of acquiring and organizing data that are authorized for access online. Applied to the medical and health field, the technique allows existing rich medical data resources to be used more fully and more efficiently.
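The three improvements named in the abstract (recording and correcting URL anomalies, archiving the original data, and keeping the login session) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the abstract does not describe in detail. The names `normalize_url`, `ArchivingCrawler`, and `make_session_fetch`, the URL repair rules, and the retry count are all assumptions made for illustration.

```python
import hashlib
import http.cookiejar
import urllib.request
from pathlib import Path
from typing import Callable
from urllib.parse import quote, urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Repair a few common URL defects (hypothetical rule set)."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url  # assume HTTP when the scheme is missing
    parts = urlsplit(url)
    path = quote(parts.path, safe="/%")  # percent-encode spaces and similar
    # Lowercase scheme and host, keep the query, drop the fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def make_session_fetch(timeout: float = 10.0) -> Callable[[str], bytes]:
    """Return a fetch function that reuses cookies, so a login persists."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

    def fetch(url: str) -> bytes:
        with opener.open(url, timeout=timeout) as resp:
            return resp.read()

    return fetch


class ArchivingCrawler:
    """Fetch URLs, log failures, and store every raw response on disk."""

    def __init__(self, fetch: Callable[[str], bytes], archive_dir, max_retries: int = 2):
        self.fetch = fetch
        self.archive_dir = Path(archive_dir)
        self.archive_dir.mkdir(parents=True, exist_ok=True)
        self.max_retries = max_retries
        self.error_log = []  # one dict per failed attempt

    def crawl(self, urls):
        saved = {}
        for raw_url in urls:
            url = normalize_url(raw_url)
            for attempt in range(1, self.max_retries + 1):
                try:
                    body = self.fetch(url)
                except Exception as exc:
                    # Record the anomaly and retry instead of aborting the run.
                    self.error_log.append(
                        {"url": url, "attempt": attempt, "error": repr(exc)}
                    )
                    continue
                # Archive the unparsed bytes under a name derived from the URL.
                name = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16] + ".raw"
                path = self.archive_dir / name
                path.write_bytes(body)
                saved[url] = path
                break
        return saved
```

Passing the fetch function in as a parameter keeps the crawl logic testable without network access. In real use one would pass `make_session_fetch()`, submit the login request once through it, and crawl only sites for which access has been authorized, as the abstract specifies.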