Determine Superscript/Subscript Relations in Typeset Mathematical Expressions Based on Statistic Features

Cited by: 10


Authors: Jiang Hongying (江红英)[1], Jin Jianming (靳简明)[1], Wang Qingren (王庆人)[1]

Affiliation: [1] Institute of Machine Intelligence, Nankai University, Tianjin 300071

Source: Computer Engineering and Applications (《计算机工程与应用》), 2003, No. 28, pp. 75-78 (4 pages)

Funding: Supported by the National Natural Science Foundation of China, "Tianyuan Mathematics Fund" (Grant No. TY10026002-04-04-01)

Abstract: Typeset mathematical expressions differ from ordinary text in many respects. Their two-dimensional structure means that expression recognition involves not only symbol recognition but, more importantly, structure analysis. Superscript and subscript relations occur frequently in expressions, are hard to resolve, and are easily confused with the horizontal (same-line) relation. This paper proposes two methods, both based on statistical features, for discriminating superscript/subscript relations in typeset mathematical expressions: the first analyzes the bounding rectangles of the symbols directly, and the second makes use of the symbols' recognition results. Experimental results show that both methods improve on comparable existing methods; the second, recognition-based method not only separates superscript/subscript relations from the horizontal relation well but also yields a large inter-class distance.

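Keywords: mathematical expression recognition; superscript/subscript discrimination; statistical features; document image processing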

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
