Affiliation: [1] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2015, No. 12, pp. 1439-1449 (11 pages)
Funding: Natural Science Foundation of Jiangsu Province
Abstract: With the arrival of the big-data era, large-scale XML files carrying huge volumes of information and complex structure keep emerging. Traditional XML query matching techniques require large amounts of storage space and substantial pre-parsing work, so they cannot effectively meet the matching demands of large XML files. To address this, the paper analyzes the core ideas of the existing classic matching algorithms and, drawing on multi-threaded parallelism, proposes a new parallel pattern matching algorithm for XML data streams, called PPS (parallel path stream). While parsing the XML file sequentially in stream mode, the algorithm caches the subtrees whose root is the root element of the query pattern, stores the encoding information of their nodes in sequential linked lists, and adds each list to a task list after effective filtering. A dedicated matching method then operates in parallel on the sequential lists in the task pool to obtain the matching results. Experiments show that PPS significantly reduces storage space, that its filtering process and parallel operation effectively reduce matching time, and that it has a certain advantage with respect to query path length.
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
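The pipeline outlined in the abstract (stream parsing, buffering subtrees rooted at the query root, encoding nodes into ordered lists, filtering, then parallel matching over a task list) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual node encoding, filter, or matching method, so everything here is an assumption made for illustration: nodes are encoded as `(tag, level)` pairs, the filter only checks that every query tag occurs, the query is a plain parent-child path starting at the subtree root, and the names `encode_subtrees`, `passes_filter`, `count_matches`, and `pps_sketch` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample document, not taken from the paper.
SAMPLE = b"""<lib>
  <book><author><name>A</name></author><title>T</title></book>
  <book><title>U</title></book>
  <book><author><name>B</name><name>C</name></author></book>
</lib>"""

def encode_subtrees(source, root_tag):
    """Stream-parse `source` and yield, for each subtree rooted at
    `root_tag`, an ordered list of (tag, level) pairs in document order.
    The (tag, level) encoding is an assumption, not the paper's scheme."""
    nodes, depth = None, 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if nodes is None and elem.tag == root_tag:
                nodes, depth = [], 0          # start buffering a new subtree
            if nodes is not None:
                nodes.append((elem.tag, depth))
                depth += 1
        else:
            if nodes is not None:
                depth -= 1
                if depth == 0:                # subtree root just closed
                    yield nodes
                    nodes = None
            elem.clear()                      # drop parsed content we no longer need

def passes_filter(nodes, path):
    """Assumed filter: keep a list only if every query tag occurs in it."""
    tags = {tag for tag, _ in nodes}
    return all(step in tags for step in path)

def count_matches(nodes, path):
    """Count nodes whose root-to-node tag chain equals `path`
    (parent-child steps only), using a stack rebuilt from the levels."""
    stack, hits = [], 0
    for tag, level in nodes:
        del stack[level:]                     # pop back to this node's parent
        stack.append(tag)
        if stack == path:
            hits += 1
    return hits

def pps_sketch(source, path, workers=4):
    """Build the task list, then match each ordered list on a thread pool."""
    tasks = [n for n in encode_subtrees(source, path[0])
             if passes_filter(n, path)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda n: count_matches(n, path), tasks))
```

Calling `pps_sketch(io.BytesIO(SAMPLE), ["book", "author", "name"])` returns one match count per subtree that survives the filter. In standard CPython the thread pool shows the task-pool structure rather than a real speedup for CPU-bound matching; a process pool or a language with true multi-threading would be needed to reproduce the kind of timing gains the abstract reports.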