Data Transformation from Hierarchical Model to Relational Model Based on Example Programming (original title: 基于示例编程的层次模型到关系模型的数据转换)

Authors: Zhou Xiaonan (周晓楠), Li Gui (李贵) [1], Li Zhengyu (李征宇) [1]

Affiliation: [1] Shenyang Jianzhu University, Shenyang, Liaoning

Source: Hans Journal of Data Mining (《数据挖掘》), 2022, Issue 4, pp. 334-350 (17 pages)

Abstract: Combining data from multiple sources into unified storage to build a data warehouse is an important step in web data integration. Data integration is achieved through data transformation and mainly addresses the distribution and heterogeneity of data. Many applications store and transfer data in hierarchical structures. This tree-based hierarchical model fits the underlying data well, so hierarchical formats are popular for exporting data and exchanging it between applications. To facilitate storage and querying, such hierarchical data usually has to be converted into a relational representation. However, because hierarchical and relational data have different characteristics, and because the data sources to be processed may be large, this conversion involves a considerable workload. To address this problem, this paper adopts a programming-by-example (example-based programming) approach for migrating hierarchically structured documents to a relational format. It proposes a program synthesis algorithm that decomposes the task of synthesizing a relational table into two subtasks, column extraction and row extraction, and learns the target transformation from input-output examples, so that XML or JSON documents are converted into relational tables. Experimental results show that the method can generate the required programs for transforming hierarchical data into relational data and can carry out the data transformation on the datasets.
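
The decomposition into column extraction and row extraction can be illustrated with a small Python sketch. This is not the synthesis algorithm from the paper, which the abstract does not spell out. The sample document `DOC`, the helper names (`walk`, `labels`, `extract_column`, `learn_column`, `extract_rows`), and the "deepest shared ancestor" rule used to filter rows are all assumptions made for illustration only.

```python
import json
from itertools import product

# Hypothetical input document (not taken from the paper).
DOC = json.loads("""
{"departments": [
  {"name": "R&D", "employees": [{"name": "Ann", "age": 31},
                                {"name": "Bob", "age": 45}]},
  {"name": "Sales", "employees": [{"name": "Cid", "age": 28}]}
]}
""")

def walk(node, path=()):
    """Yield (path, value) for every leaf; list indices stay in the path."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from walk(child, path + (index,))
    else:
        yield path, node

def labels(path):
    """Drop list indices so that only the key names (a schema path) remain."""
    return tuple(step for step in path if isinstance(step, str))

def extract_column(doc, schema_path):
    """Column extraction: every leaf whose key sequence equals schema_path."""
    return [(p, v) for p, v in walk(doc) if labels(p) == schema_path]

def learn_column(doc, example_values):
    """Return the schema paths whose leaves cover all example values."""
    candidates = {labels(p) for p, _ in walk(doc)}
    return sorted(
        sp for sp in candidates
        if set(example_values) <= {v for _, v in extract_column(doc, sp)}
    )

def shared_prefix(a, b):
    """Length of the common prefix of two concrete paths."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def extract_rows(columns):
    """Row extraction: filter the cross product of the columns, keeping tuples
    whose nodes pairwise share the deepest ancestor seen for that column pair."""
    pairs = [(i, j) for i in range(len(columns)) for j in range(i + 1, len(columns))]
    deepest = {
        (i, j): max(shared_prefix(a, b) for a, _ in columns[i] for b, _ in columns[j])
        for i, j in pairs
    }
    return [
        tuple(value for _, value in combo)
        for combo in product(*columns)
        if all(shared_prefix(combo[i][0], combo[j][0]) == deepest[i, j] for i, j in pairs)
    ]

# One example output row drives the whole transformation.
example_row = ("R&D", "Ann", 31)
schema_paths = [learn_column(DOC, [value])[0] for value in example_row]
table = extract_rows([extract_column(DOC, sp) for sp in schema_paths])
print(table)
```

With the single example row `("R&D", "Ann", 31)`, the sketch recovers one schema path per column and produces the rows `('R&D', 'Ann', 31)`, `('R&D', 'Bob', 45)` and `('Sales', 'Cid', 28)`. A real synthesizer would search a much richer space of extractors and row predicates and would need a strategy for ambiguous examples; here the first matching path is simply taken.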

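Keywords: data transformation; hierarchical model; applications; hierarchical structure; document conversion; learning from examples; data warehouse; data integration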

Classification number: TP311.13 [Automation and Computer Technology - Computer Software and Theory]

 
