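Affiliation: [1] College of Mathematics and Computer Science, Hebei University, Baoding, Hebei 071002
Source: Journal of Computer Applications (《计算机应用》), 2007, No. 8, pp. 2036-2037, 2065 (3 pages)
Funding: Hebei Province Science and Technology Research and Development Program (06213598)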
Abstract: Formula extraction is a fundamental step in recognizing printed mathematical formulas. Most existing recognition methods assume that the formula regions are already known, and research on the extraction step itself remains scarce. This paper introduces fuzzy classification theory and proposes an algorithm for extracting isolated mathematical formulas. Based on statistics gathered from a large number of training samples, six features are selected, including degree of irregularity, width-to-height ratio, and density. These features are used to build fuzzy classification rules that distinguish isolated formula lines, text lines, and title lines, which in turn enables extraction of the isolated formula lines. Experimental results indicate that the method achieves high accuracy and good robustness.
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
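
Illustrative sketch: the abstract describes classifying each line as an isolated formula line, text line, or title line using fuzzy rules over line features. The record does not give the paper's membership functions, thresholds, or full feature set, so everything below is an assumption made for illustration: the trapezoidal membership shape, the min/max rule combination, the names `LineFeatures`, `trapezoid`, `RULES`, and `classify`, and every numeric breakpoint. Only three of the six features are shown, and this is not the authors' implementation.

```python
import math
from dataclasses import dataclass

INF = math.inf


@dataclass
class LineFeatures:
    irregularity: float   # assumed scale: 0 = evenly aligned symbols, 1 = very irregular
    aspect_ratio: float   # line width divided by line height
    density: float        # fraction of ink pixels inside the line's bounding box


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear in between.

    Use -INF for a and b (or INF for c and d) to get an open-ended shoulder.
    """
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical breakpoints (a, b, c, d) per class and feature; not from the paper.
RULES = {
    "formula": {
        "irregularity": (0.3, 0.6, INF, INF),
        "aspect_ratio": (-INF, -INF, 8.0, 15.0),
        "density": (-INF, -INF, 0.15, 0.25),
    },
    "text": {
        "irregularity": (-INF, -INF, 0.2, 0.4),
        "aspect_ratio": (10.0, 20.0, INF, INF),
        "density": (0.15, 0.25, INF, INF),
    },
    "title": {
        "irregularity": (-INF, -INF, 0.2, 0.4),
        "aspect_ratio": (-INF, -INF, 8.0, 15.0),
        "density": (0.2, 0.3, INF, INF),
    },
}


def classify(features):
    """Return (best_label, memberships).

    Each class membership is the minimum over its feature memberships (fuzzy AND);
    the predicted label is the class with the largest membership.
    """
    memberships = {}
    for label, rule in RULES.items():
        degrees = [trapezoid(getattr(features, name), *params)
                   for name, params in rule.items()]
        memberships[label] = min(degrees)
    best = max(memberships, key=memberships.get)
    return best, memberships


if __name__ == "__main__":
    label, scores = classify(LineFeatures(irregularity=0.7, aspect_ratio=5.0, density=0.1))
    print(label, scores)
```

Taking the minimum across features means a line must fit every feature of a class to score highly for it, which is one common way to combine fuzzy conditions. A weighted combination would be an equally plausible reading of "fuzzy classification rules".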