A Schema-Based Approach to GML Compression

Cited by: 2

Authors: Wei Qingting (魏勍颋)[1,2], Guan Jihong (关佶红)[1], Zhou Shuigeng (周水庚)[3,4]

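Affiliations: [1] School of Electronics and Information Engineering, Tongji University, Shanghai 201804; [2] School of Software, Nanchang University, Nanchang 330046; [3] Shanghai Key Laboratory of Intelligent Information Processing (Fudan University), Shanghai 200433; [4] School of Computer Science, Fudan University, Shanghai 200433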

Source: Journal of Computer Research and Development (《计算机研究与发展》), 2011, No. 9, pp. 1704-1713 (10 pages)

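Funding: National Natural Science Foundation of China (60873040); National High Technology Research and Development Program of China ("863" Program) (2009AA01Z135)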

Abstract: GML, an XML-based geographic modeling language, has become the de facto encoding standard for geospatial data. GML documents are usually very verbose: structures such as tags and attribute names repeat frequently, which makes the data self-describing but inflates its size. They are also rich in data, with many space-consuming textual items, including attribute values and element contents. Worse, they often contain large amounts of high-precision spatial coordinate data stored as text, which occupies more space than the same data in binary form. Storing and transferring GML documents is therefore costly. This paper proposes an effective schema-based approach to GML compression. It first infers a schema from the document and validates the document against that inferred schema; it then encodes the state transition paths of the tree automaton as bits, compresses the coordinate data with delta encoding, and finally passes the inferred schema and all encodings to general text compressors. Experiments on real GML documents show that the proposed compressor achieves better compression ratios than typical general-purpose text compressors (gzip and PPMD), state-of-the-art XML compressors (XMill, XMLPPM, and XWRT), and the existing GML compressor GPress.
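The abstract does not spell out the delta encoding scheme, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes coordinates arrive as decimal strings with a known, fixed number of fractional digits; the function names `delta_encode` and `delta_decode` and the `decimals` parameter are hypothetical. Each coordinate is scaled to an integer, the first value is kept as is, and every later value is stored as its difference from the previous one.

```python
from decimal import Decimal


def delta_encode(coords, decimals):
    """Scale decimal strings to integers and keep successive differences.

    Assumption: every coordinate has at most `decimals` fractional digits.
    """
    scale = Decimal(10) ** decimals
    ints = [int((Decimal(c) * scale).to_integral_value()) for c in coords]
    if not ints:
        return []
    # First value is absolute; the rest are differences from the predecessor.
    return [ints[0]] + [b - a for a, b in zip(ints, ints[1:])]


def delta_decode(deltas, decimals):
    """Invert delta_encode, restoring strings with `decimals` fractional digits."""
    scale = Decimal(10) ** decimals
    step = Decimal(1).scaleb(-decimals)  # e.g. 0.000001 for decimals=6
    out = []
    total = 0
    for d in deltas:
        total += d  # running sum rebuilds the scaled integer
        out.append(format((Decimal(total) / scale).quantize(step), "f"))
    return out


coords = ["121.473701", "121.473705", "121.473698"]
deltas = delta_encode(coords, 6)
restored = delta_decode(deltas, 6)
```

Neighbouring points in a geometry tend to lie close together, so most differences are small integers, which a downstream general text compressor can usually pack more tightly than the original high-precision strings.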

Keywords: XML compression; GML compression; schema; delta encoding; tree automaton

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
