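Affiliation: [1] Department of Applied Mathematics, Dalian University of Technology, Dalian 116024, Liaoning, China
Source: Journal of Dalian University of Technology (《大连理工大学学报》), 2006, No. 3, pp. 454-459 (6 pages)
Funding: National Natural Science Foundation of China (19971012); National Defense Basic Research Fund of the Commission of Science, Technology and Industry for National Defense (J1700B002); Liaoning Province Academic Leader Fund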
Abstract: Current optical character recognition (OCR) systems achieve high recognition rates on both handwritten and printed text, but they lack the ability to analyze and reconstruct the structure of mathematical formulas. To address this, the basic design methods of programming-language compilers are applied to the structural analysis of mathematical formulas. The paper focuses on locating superscripts and subscripts, the LL(1)-grammar rules for composing expressions, and the design of the formula structure analyzer; a neural-network-based method for recognizing mathematical symbols is described briefly. For mathematical expressions in printed scientific documents, each symbol is first recognized through preprocessing, character segmentation, and classification, yielding a string of characters sorted by left border. The structure analyzer then locates superscripts and subscripts and determines the relative positions of the symbols. Finally, the syntax tree generated by the structure analyzer is converted into editable LaTeX. Experiments on examples gave fairly satisfactory results.
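Keywords: formula reconstruction; structural analysis; pattern recognition; LL(1) grammar; neural network
Classification: TP391.43 [Automation and Computer Technology: Computer Application Technology]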
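
The pipeline outlined in the abstract (symbols sorted by left border, superscript/subscript location, a syntax tree, LaTeX output) can be illustrated with a small sketch. This is not the authors' implementation: the `Symbol` class, the 25% vertical threshold in `classify`, and the toy grammar `formula -> item*`, `item -> BASE script*` are all assumptions chosen for illustration, and the neural-network recognition step is skipped by supplying already recognized characters with bounding boxes.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    """A recognized character with a bounding box (hypothetical layout).

    Image coordinates are assumed: y grows downward, so top < bottom.
    """
    char: str
    left: float
    top: float
    bottom: float


def classify(sym, base):
    """Label sym as 'sup', 'sub' or 'base' relative to the current base symbol.

    Assumed heuristic: compare the vertical centre of sym with the upper
    and lower quarter of the base symbol's bounding box.
    """
    height = base.bottom - base.top
    centre = (sym.top + sym.bottom) / 2
    if centre < base.top + 0.25 * height:
        return "sup"
    if centre > base.bottom - 0.25 * height:
        return "sub"
    return "base"


def tokenize(symbols):
    """Sort symbols by left border and tag each one with its script level."""
    tokens = []
    base = None
    for sym in sorted(symbols, key=lambda s: s.left):
        kind = "base" if base is None else classify(sym, base)
        if kind == "base":
            base = sym
        tokens.append((kind, sym.char))
    return tokens


def parse(tokens):
    """Predictive parse with one token of lookahead.

    Toy grammar: formula -> item* ; item -> BASE script* ; script -> SUP | SUB
    Returns a flat list of nodes standing in for the syntax tree.
    """
    pos = 0
    tree = []
    while pos < len(tokens):
        kind, char = tokens[pos]
        if kind != "base":
            raise ValueError("script without a base symbol")
        node = {"base": char, "sup": "", "sub": ""}
        pos += 1
        # The lookahead token decides whether another script follows.
        while pos < len(tokens) and tokens[pos][0] in ("sup", "sub"):
            script_kind, script_char = tokens[pos]
            node[script_kind] += script_char
            pos += 1
        tree.append(node)
    return tree


def to_latex(tree):
    """Convert the parsed nodes into a LaTeX string."""
    parts = []
    for node in tree:
        text = node["base"]
        if node["sup"]:
            text += "^{" + node["sup"] + "}"
        if node["sub"]:
            text += "_{" + node["sub"] + "}"
        parts.append(text)
    return "".join(parts)


# Made-up bounding boxes for "x^2 + y_i", deliberately given out of order.
symbols = [
    Symbol("i", left=41, top=19, bottom=25),
    Symbol("x", left=0, top=10, bottom=20),
    Symbol("+", left=20, top=12, bottom=18),
    Symbol("2", left=11, top=4, bottom=10),
    Symbol("y", left=30, top=10, bottom=22),
]
print(to_latex(parse(tokenize(symbols))))  # prints x^{2}+y_{i}
```

The sketch handles only one level of scripts and treats every inline symbol, including operators, as a possible base; nested scripts, fractions, and radicals would need a richer grammar and more careful baseline tracking.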