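Authors: Wang Jianrong (王建荣)[1], Hua Liansheng (华连生)[1], Tang Huai'ou (唐怀瓯)[1], Wang Yun (王云)[1], Wang Jing (王静)[1]
Source: Computer Technology and Development (《计算机技术与发展》), 2018, No. 2, pp. 167-172 (6 pages)
Funding: Key Technology Integration Project of the China Meteorological Administration (CMAGJ2015M29); Science and Technology Development Fund Project of the Anhui Meteorological Bureau (KM201604)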
Abstract: The volume of numerical weather prediction (NWP) product data keeps growing. Traditional relational databases cannot store and manage it adequately, and queries over large volumes of historical data are slow. To address these problems, the authors designed a distributed system for processing and storing NWP products. A Quartz scheduler collects NWP product files at fixed intervals. A Kafka distributed message queue decouples the decoding program from the storage program. Decoding log files, original product files, and the per-element GRIB files produced by decoding are written to the HDFS distributed file system, and a MapReduce job loads the decoding log records into HBase. HBase supports its primary index on the Rowkey well but handles multi-condition queries poorly, so a Solr index is added to optimize them. When HBase receives data, it automatically triggers a coprocessor that synchronizes each record to the Solr index (SolrCloud), which gives HBase a secondary index. Test results show that product files are written to the Hadoop file system at an average of 82.54 MB/s, that HBase ingests up to 13 677 records per second at its fastest, and that retrieval results are returned within milliseconds. These figures meet the storage and retrieval latency requirements for NWP products in operational use.
Keywords: Quartz; decoding log file; Kafka; HBase; Solr; coprocessor
Classification: TP302 [Automation and Computer Technology / Computer System Architecture]
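The abstract's central design point is that rows are fetched directly by Rowkey, while multi-condition queries go through a separate index that is updated whenever a row is written. The sketch below illustrates only that idea, using plain in-memory Python structures. It is not the authors' implementation and does not use any HBase or Solr API; the class `IndexedStore`, the field names (`model`, `element`, `level`), and the rowkey layout are hypothetical, chosen for illustration.

```python
from collections import defaultdict


class IndexedStore:
    """Toy stand-in for a rowkey-addressed table with a secondary index.

    Hypothetical illustration only: `rows` plays the role of the primary
    table, and `index` plays the role of the external search index.
    """

    def __init__(self, indexed_fields):
        self.rows = {}                     # rowkey -> record (primary access path)
        self.indexed_fields = tuple(indexed_fields)
        self.index = defaultdict(set)      # (field, value) -> set of rowkeys

    def put(self, rowkey, record):
        # Drop stale index entries when an existing row is overwritten.
        old = self.rows.get(rowkey)
        if old is not None:
            for field in self.indexed_fields:
                if field in old:
                    self.index[(field, old[field])].discard(rowkey)
        self.rows[rowkey] = dict(record)
        # Post-write hook: mirror the indexed fields into the secondary index,
        # loosely analogous to the write-triggered synchronization described
        # in the abstract.
        for field in self.indexed_fields:
            if field in record:
                self.index[(field, record[field])].add(rowkey)

    def get(self, rowkey):
        # Direct lookup by rowkey, the access pattern the primary index serves.
        return self.rows.get(rowkey)

    def query(self, **conditions):
        # Multi-condition lookup: intersect the rowkey sets of each condition,
        # then fetch the matching rows by rowkey.
        keysets = []
        for field, value in conditions.items():
            if field not in self.indexed_fields:
                raise KeyError(f"field is not indexed: {field}")
            keysets.append(self.index.get((field, value), set()))
        if not keysets:
            return []
        matching = set.intersection(*keysets)
        return [self.rows[key] for key in sorted(matching)]


if __name__ == "__main__":
    store = IndexedStore(["model", "element", "level"])
    # Hypothetical rowkeys and metadata for decoded product records.
    store.put("ECMWF_2018010100_TEM_850", {"model": "ECMWF", "element": "TEM", "level": 850})
    store.put("ECMWF_2018010100_RHU_850", {"model": "ECMWF", "element": "RHU", "level": 850})
    store.put("GRAPES_2018010100_TEM_500", {"model": "GRAPES", "element": "TEM", "level": 500})
    for row in store.query(model="ECMWF", level=850):
        print(row)
```

The index is updated inside `put` so that it cannot drift from the primary store. That is the property a write-triggered hook is meant to provide, though a real deployment must also handle failures between the two writes, which this sketch ignores.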