Automatic Component Selection and Parameter Configuration in Development of Big Data System


Authors: Zhong Yu[1], Qiu Mingming[1], Huang Xiangdong[1]

Affiliation: [1] School of Software, Tsinghua University, Beijing 100084

Source: Journal of Frontiers of Computer Science and Technology, 2016, No. 9, pp. 1211-1220 (10 pages)

Funding: Big Data Science and Technology Special Project, Tsinghua National Laboratory for Information Science and Technology

Abstract: A big data application system spans data collection, storage, analysis, mining, visualization, and other technical stages. Each stage has many candidate solutions, several hundred systems are involved overall, and their configuration is complex, which makes building a big data application system very challenging for enterprises. To address component selection in the development of such systems, this paper establishes standardized requirement indicators and uses a decision tree model to select big data components automatically. Starting from several mainstream distributed storage systems and taking Cassandra as an example, the paper conducts experiments and uses multiple regression fitting to build a performance model over hardware parameters; with user requirements as input, the performance model is then used to configure the system's hardware parameters. By studying each system's principles, architecture, characteristics, and application scenarios, the paper also builds a knowledge base for software parameter configuration to guide the setting of software parameters. Together, these steps address automatic component selection and parameter configuration in big data system development.
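
The two steps named in the abstract, decision-tree component selection and regression-based hardware configuration, can be illustrated with a minimal sketch. This is not the paper's implementation. The requirement fields, the tree rules, the component labels other than Cassandra, the training measurements, the candidate hardware options, and the cost function are all hypothetical, and the tree is hand-written rather than learned from data. The performance model is assumed to be linear in node count and memory size and is fitted by ordinary least squares.

```python
from itertools import product

def select_component(req):
    """Toy decision tree over requirement indicators (hypothetical rules)."""
    if req["needs_sql_transactions"]:
        return "relational-database"
    if req["write_heavy"] and req["needs_horizontal_scaling"]:
        return "Cassandra"
    return "distributed-file-system"

def fit_linear(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    rows = [[1.0] + list(x) for x in X]          # prepend intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(n):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][n] / A[i][i] for i in range(n)]

def predict(coef, nodes, mem_gb):
    """Predicted throughput (ops/s) for a hardware configuration."""
    return coef[0] + coef[1] * nodes + coef[2] * mem_gb

# Hypothetical candidate hardware options and cost model.
NODE_OPTIONS = [1, 2, 4, 8]
MEM_OPTIONS = [8, 16, 32]

def cost(nodes, mem_gb):
    return nodes * (100 + 5 * mem_gb)

def choose_hardware(coef, required_ops):
    """Cheapest candidate whose predicted throughput meets the requirement."""
    feasible = [(n, m) for n, m in product(NODE_OPTIONS, MEM_OPTIONS)
                if predict(coef, n, m) >= required_ops]
    return min(feasible, key=lambda c: cost(*c)) if feasible else None

# Synthetic measurements: (nodes, memory in GB) -> throughput in ops/s.
# Generated from 1000 + 2000*nodes + 100*mem_gb, not real benchmark data.
CONFIGS = [(1, 8), (2, 8), (2, 16), (4, 16), (4, 32), (8, 32)]
THROUGHPUT = [1000 + 2000 * n + 100 * m for n, m in CONFIGS]

requirements = {"needs_sql_transactions": False, "write_heavy": True,
                "needs_horizontal_scaling": True}
component = select_component(requirements)
coef = fit_linear(CONFIGS, THROUGHPUT)
hardware = choose_hardware(coef, required_ops=10000)
print(component, hardware)
```

With these synthetic inputs the tree returns Cassandra, and the cheapest configuration predicted to sustain 10,000 ops/s is 4 nodes with 16 GB each. A real performance model would be fitted to measured benchmark results and would likely need more hardware parameters than the two assumed here.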

Keywords: big data system; component selection; decision tree model; parameter configuration; performance model

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
