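Affiliation: [1] College of Computer and Information Technology, China Three Gorges University, Yichang, Hubei 443002
Source: Journal of Chongqing University of Technology: Natural Science (《重庆理工大学学报(自然科学)》), 2015, No. 7, pp. 69-73 (5 pages)
Funding: Natural Science Research Project of the Hubei Provincial Department of Education (Q20141212)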
Abstract: Hadoop-based (Hive) data warehouses are currently built mostly with a "one-to-one" ("case to case") pattern. This paper first introduces that pattern and uses an example to show its shortcomings. Then, drawing on the Builder design pattern from software engineering, it proposes a "metadata-driven builder pattern" for constructing a Hadoop Hive data warehouse, that is, its ETL process. The pattern has two advantages. First, it is driven by metadata, and the metadata is manipulated through a relational database management system (RDBMS); because Hive stores its metadata in an RDBMS, this exploits the RDBMS's efficiency for metadata operations. Second, it distinguishes "general knowledge" from "specific-object knowledge" and designs and implements the ETL process on the basis of that classification, which eliminates the many unnecessary repeated operations that the "one-to-one" pattern incurs.
Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]
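Illustrative sketch: the abstract gives no implementation details, so the following is only a minimal Python sketch of what a metadata-driven builder might look like, not the authors' code. The `TableMeta` record, the `HiveEtlBuilder` class, the `build_all` helper, and the sample table names and paths are all hypothetical. The builder holds the "general knowledge" (the steps every table goes through), while each metadata record holds the "specific-object knowledge" (what differs per table), so adding a table means adding a record rather than writing another script.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TableMeta:
    """Specific-object knowledge: what differs from table to table.

    Hypothetical schema; in the pattern described by the abstract this
    information would be read from metadata held in an RDBMS.
    """
    name: str
    columns: List[Tuple[str, str]]  # (column name, Hive type)
    source_path: str


class HiveEtlBuilder:
    """General knowledge: the construction steps shared by every table."""

    def __init__(self, meta: TableMeta) -> None:
        self.meta = meta
        self.statements: List[str] = []

    def build_create(self) -> "HiveEtlBuilder":
        # Generate the CREATE TABLE statement from the column metadata.
        cols = ", ".join(f"{col} {typ}" for col, typ in self.meta.columns)
        self.statements.append(
            f"CREATE TABLE IF NOT EXISTS {self.meta.name} ({cols})"
        )
        return self

    def build_load(self) -> "HiveEtlBuilder":
        # Generate the LOAD statement from the source-path metadata.
        self.statements.append(
            f"LOAD DATA INPATH '{self.meta.source_path}' "
            f"INTO TABLE {self.meta.name}"
        )
        return self

    def result(self) -> List[str]:
        return list(self.statements)


def build_all(metadata: List[TableMeta]) -> Dict[str, List[str]]:
    """Director: run the same build steps once per metadata record."""
    return {
        m.name: HiveEtlBuilder(m).build_create().build_load().result()
        for m in metadata
    }


if __name__ == "__main__":
    # Hypothetical sample metadata, for illustration only.
    sample = [
        TableMeta("orders", [("id", "INT"), ("amount", "DOUBLE")], "/data/orders"),
        TableMeta("users", [("id", "INT"), ("name", "STRING")], "/data/users"),
    ]
    for table, stmts in build_all(sample).items():
        for stmt in stmts:
            print(stmt)
```

Under the "one-to-one" approach each table would need its own hand-written statements; here the statements are generated, and only the metadata list grows.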