检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
机构地区:[1]北京市气象信息中心,北京100089 [2]北京市气象探测中心,北京100176
出 处:《计算机应用》2018年第1期38-43,55,共7页journal of Computer Applications
基 金:中国气象局公益性行业科研专项基金资助项目(201206031)~~
摘 要:针对现有气象自动站业务平台面临处理数据不及时、交互式响应慢、统计时效差等问题,提出了使用Spark Streaming技术和HBase解决该问题的方法,将实时计算框架和分布式数据库系统结合起来实现大规模流式数据处理。使用Flume收集自动站数据,Spark Streaming对数据进行流式处理并存储到HBase数据库中,并设计Spark框架下的自动站数据流式入库处理算法和要素极值的实时统计算法,在Cloudera平台下实现了一个高速可靠的实时采集、处理、统计的应用系统。通过对比分析和性能监测,验证了该系统具有低延迟和高吞吐量的优势,运行状况良好,负载均衡。实验结果表明,Spark Streaming用于气象自动站的实时业务处理,数据并行写入HBase、基于HBase的查询和各类要素统计均能达到毫秒级响应,完全能满足自动站数据的应用需求,有效地支撑天气预报业务。Aiming at these problems of the current data service of Automatic Weather Stations (AWS), including data processing delay, slow interactive response, and low statistical efficiency, a new method based on Spark Streaming and HBase technologies was proposed and introduced to process massive streaming AWS data by inte^ating stream computing framework and distributed database system. Flume was used for data collection, and data processing was conducted by Spark Streaming and data were stored into HBase. In framework of Spark, two algorithms, one for writing streaming AWS data into HBase database, the other for realizing real-time statistical calculation of different observed AWS meteorological elements were designed. Finally, a stable and high-efficient system for real-time acquisition, processing, and statistics of AWS data was developed on Cloudera platform. Based on comparative analysis and running monitoring, performances of the system were confirmed, including low latency, high I/O efficiency, stable running status and excellent load balance. The experimental results show that the response time of Spark Streaming-based real-time operational processing for AWS data can reach to millisecond level, which includes paralleled data writing into HBase, HBase-based data query and statistics on different meteorological elements. The system can fully meet needs of operational applications to AWS data, and provides effective support to weather forecast.
关 键 词:气象自动站 SPARK STREAMING 流计算 气象数据处理 FLUME
分 类 号:TP391.7[自动化与计算机技术—计算机应用技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.117