Automatic Reconstruction of Cross-Cut Chinese Documents Using Information Quantity

Cited by: 5

Authors: Zhao Bo (赵波) [1,2], Zhou Yu (周宇) [1], Zhang Zhengyu (张正宇) [1,3], Na Ying (那莹) [1], Ma Tinghuai (马廷淮) [1]

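Affiliations: [1] School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing 210044; [2] Institute of Computer Science and Technology, Peking University, Beijing 100080; [3] Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190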

Source: Journal of Computer-Aided Design & Computer Graphics, 2015, No. 6, pp. 1039-1046 (8 pages)

Funding: National Natural Science Foundation of China (61173143); Special Fund for Public Welfare Industry (Meteorology) Research (GYHY201506080)

Abstract: Cross-cut shreds of a Chinese document offer no shape features to guide reassembly, so this paper proposes an automatic reconstruction algorithm based on the information quantity of the shred content. First, we analyze how to represent the features of a shred based on the Chinese characters it contains. Building on that representation, we propose a method for estimating the feature matching degree between shreds in two directions, horizontal and vertical. Finally, information quantity determines the order in which shreds are joined, and the shreds are spliced one at a time. Matching experiments on different numbers of shreds show that, compared with traditional methods, the horizontal and vertical matching estimates improve accuracy by about 4.73% and 3.76%, respectively. Automatic reconstruction experiments show that, compared with traditional algorithms, the information-quantity-based algorithm lowers the error rate by about 18% and greatly reduces time complexity.

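Keywords: document reconstruction; Chinese documents; shreds; matching degree estimation; information quantity; automatic reconstruction algorithm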

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
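
The abstract names three steps (feature representation, horizontal and vertical matching degree, and a splicing order driven by information quantity) but gives no formulas. The sketch below illustrates only the general idea under loudly stated assumptions: shreds are binary pixel grids, "information quantity" is approximated by the ink-pixel count on an edge, and "matching degree" by the fraction of touching pixels that agree. These definitions and all function names are hypothetical stand-ins, not the authors' method.

```python
from typing import List, Sequence

Shred = List[List[int]]  # binary image: rows of 0 (blank) or 1 (ink)


def edge(shred: Shred, side: str) -> List[int]:
    """Return the left or right edge column of a shred."""
    col = 0 if side == "left" else -1
    return [row[col] for row in shred]


def info_quantity(pixels: Sequence[int]) -> int:
    """ASSUMED proxy for information quantity: ink pixels on an edge."""
    return sum(pixels)


def agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """ASSUMED matching degree: fraction of touching pixels that agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def horizontal_match(left: Shred, right: Shred) -> float:
    """Score placing `right` immediately to the right of `left`."""
    return agreement(edge(left, "right"), edge(right, "left"))


def vertical_match(upper: Shred, lower: Shred) -> float:
    """Score placing `lower` immediately below `upper`."""
    return agreement(upper[-1], lower[0])


def reconstruct_row(shreds: List[Shred]) -> List[int]:
    """Greedily order one row of shreds, left to right.

    The chain is always extended at whichever open end carries more
    edge information, since a richer edge is assumed to yield a more
    reliable match. Returns indices into `shreds`.
    """
    remaining = list(range(len(shreds)))
    # Seed with the shred whose two edges carry the most ink.
    seed = max(
        remaining,
        key=lambda i: info_quantity(edge(shreds[i], "left"))
        + info_quantity(edge(shreds[i], "right")),
    )
    chain = [seed]
    remaining.remove(seed)
    while remaining:
        left_info = info_quantity(edge(shreds[chain[0]], "left"))
        right_info = info_quantity(edge(shreds[chain[-1]], "right"))
        if right_info >= left_info:
            tail = shreds[chain[-1]]
            best = max(remaining, key=lambda i: horizontal_match(tail, shreds[i]))
            chain.append(best)
        else:
            head = shreds[chain[0]]
            best = max(remaining, key=lambda i: horizontal_match(shreds[i], head))
            chain.insert(0, best)
        remaining.remove(best)
    return chain
```

Calling `reconstruct_row` on a shuffled list of equal-height shreds returns the indices in the proposed left-to-right order. A full reconstruction would also need to group shreds into rows and stack the rows using something like `vertical_match`, which this sketch leaves out.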
