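Source: Computer Engineering and Applications (《计算机工程与应用》), 2003, No. 28, pp. 75-78 (4 pages)
Funding: Supported by the National Natural Science Foundation of China, "Tianyuan Mathematics Fund" (Grant No. TY10026002-04-04-01)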
Abstract: Typeset mathematical expressions differ from ordinary text in many ways. Their two-dimensional structure means that expression recognition involves not only symbol recognition but, more importantly, structure analysis. Superscript and subscript relations occur frequently in expressions, are hard to resolve, and are easily confused with the horizontal (same-line) relation. This paper proposes two methods, both based on statistical features, for distinguishing superscript/subscript relations from the horizontal relation in typeset mathematical expressions. The first analyzes the bounding rectangles of the symbols directly; the second also uses the symbols' recognition results. Experiments show that both methods improve on comparable existing methods. The second method in particular not only separates superscript/subscript relations from the horizontal relation well but also yields a large inter-class distance.
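Keywords: mathematical expression recognition; superscript/subscript discrimination; statistical features; document image processing
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]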
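
The abstract does not give the features or decision rules the paper uses, so the following is only a minimal sketch of the general idea behind a bounding-rectangle approach, not the authors' method. The `Box` type, the `vertical_offset` feature, and the threshold of 0.25 are all hypothetical choices made for illustration; a statistical method would presumably estimate such a threshold from labeled data.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding rectangle in image coordinates (y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def height(self) -> float:
        return self.bottom - self.top

    @property
    def center_y(self) -> float:
        return (self.top + self.bottom) / 2.0


def vertical_offset(base: Box, nxt: Box) -> float:
    """How far nxt's center lies above base's center, in units of base height.

    Positive values mean nxt sits higher on the page than base.
    """
    return (base.center_y - nxt.center_y) / base.height


def classify_relation(base: Box, nxt: Box, threshold: float = 0.25) -> str:
    """Label the relation of nxt to base; the threshold is an assumed value."""
    d = vertical_offset(base, nxt)
    if d > threshold:
        return "superscript"
    if d < -threshold:
        return "subscript"
    return "horizontal"


base = Box(left=0, top=10, right=10, bottom=30)
raised = Box(left=11, top=5, right=17, bottom=15)
lowered = Box(left=11, top=25, right=17, bottom=35)
level = Box(left=11, top=10, right=21, bottom=30)
print(classify_relation(base, raised), classify_relation(base, lowered), classify_relation(base, level))
```

A feature computed from raw rectangles like this one ignores ascenders and descenders (the boxes of "b" and "p" are shifted relative to the baseline even when both are on the same line), which is one plausible reason a method that also knows each symbol's identity could separate the classes more cleanly.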