检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
机构地区:[1]中国科学院自动化研究所
出 处:《计算机研究与发展》2007年第7期1144-1150,共7页Journal of Computer Research and Development
基 金:国家"八六三"高技术研究发展计划基金项目(2006AA01Z153)
摘 要:提出了一种基于多候选方法的数学公式识别系统.该系统主要包括公式图像预处理,多候选公式符号分割和多候选公式结构分析3个部分.在公式符号切分中,使用3次动态规划方法对公式图像进行多候选公式符号切分.在公式结构分析中,采用层次结构方法多候选分析公式符号间的结构关系,然后使用LaTex格式和MathType格式表示数学公式的识别结果.为了确定符号间的空间位置关系,建立了符号的空间关系模型.在3268个公式图像组成的测试集上取得了78.2%的公式分析正确率.A multi-candidate mathematical expression (ME) recognition system is proposed. The system includes three main components: image preprocessing, symbol segmentation and structure analysis. During the symbol segmentation period, a three-stage segmentation method based on dynamic programming (DP) is proposed. In the initial segmentation based on DP algorithm, the ME image is segmented into several blocks. In the vertical segmentation, each block is segmented into some blobs. In the horizontal segmentation based on DP algorithm, every blob is segmented into symbols. During the structure analysis period, hierarchical structure is adopted to analyze structure of ME. The hierarchical structure analysis method consists of three steps, i.e., matrix analysis, sub-expression analysis and script expression analysis. In matrix analysis (sub-expression analysis), an ME is decomposed into several basic matrixes (basic sub-expressions) and some sub-expressions (script expressions) by reconstructing the ME (sub-expression) global structure, and then every basic matrix (sub-expression) is analyzed from bottom to up. In script analysis, a graph rewriting algorithm is adopted to build script relation trees among symbols within a script expression. A spatial relation model is built to calculate spatial relations' confidence between two symbols. The experiments are implemented on a database with 3268 ME images and the results show that the proposed system works well. Top-1 ME recognition accuracy reaches 78.2 %.
关 键 词:多候选 印刷体数学公式 动态规划 层次结构 空间关系模型
分 类 号:TP391[自动化与计算机技术—计算机应用技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.222