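Affiliations: [1] Beijing Key Laboratory on Integration and Analysis of Large-Scale Stream Data, Beijing 100144; [2] Research Institute of Data Engineering, North China University of Technology, Beijing 100144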
Source: Computer Engineering & Science (《计算机工程与科学》), 2017, No. 4, pp. 641-647 (7 pages)
Funding: Beijing Natural Science Foundation (4131001)
Abstract: Streaming data sensed by the Internet of Things consists mainly of time-series data and is characterized by large volume, continuous arrival, and multiple sources. Existing HBase-based storage systems for traffic streaming data still suffer from low storage efficiency and poor system availability under highly concurrent writes. To address this problem, we design and implement a load-balanced, real-time storage system for multi-source streaming data. The system extends the data proxy into a cluster architecture and proposes a load-balancing task scheduling algorithm that matches tasks to data proxies in order, so that the proxy cluster processes tasks with a balanced load and stores data into the HBase database in parallel. Comparative experiments show that the system keeps the share of data assigned to each data proxy between 0.3 and 0.4 and writes data to HBase about 1.5 times as fast as a single data proxy.
Keywords: multi-source streaming data; HBase; real-time storage system; data proxy; load balancing; task scheduling
Classification No.: TP393 [Automation and Computer Technology: Computer Application Technology]