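Affiliations: [1] College of Computer Science and Information, Guizhou University, Guiyang 550025 [2] Tokyo University of Agriculture and Technology, Japan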
Source: Computer Engineering and Applications (《计算机工程与应用》), 2010, No. 20, pp. 116-118 (3 pages)
Abstract: This paper studies an effective lexicon-driven method for recognizing online handwritten Japanese disease names. The disease name lexicon contains 21,713 phrases and is stored in a trie structure. In segmentation, an input handwritten disease name string is over-segmented into a sequence of primitive segments according to features such as the spatial information between adjacent strokes. Consecutive primitive segments are dynamically merged into candidate character patterns; different merges yield different candidate character sequences, which together form a segmentation candidate lattice in which each node denotes a segmentation point and each arc denotes a candidate character pattern. In recognition, matching against the disease name lexicon restricts the candidate character classes of each candidate pattern, and a beam search finds the optimal path, which is taken as the segmentation and recognition result. In experiments on 500 real handwritten disease name samples, the average recognition time per disease name was 0.87 s and the recognition rate was 83.16%.
Keywords: disease name recognition; lexicon-driven recognition; handwritten string recognition; beam search
Classification: TP391.43 [Automation and Computer Technology - Computer Application Technology]
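The recognition step described in the abstract (a trie-stored lexicon constraining a beam search over a segmentation candidate lattice) can be illustrated with a minimal sketch. This is not the paper's implementation: the lattice layout, the log-probability scores, the three-entry lexicon, and the names `build_trie`, `beam_search`, `beam_width`, and `max_merge` are all assumptions made for illustration, and the character classifier is replaced by hand-written candidate lists.

```python
from collections import defaultdict

END = None  # sentinel key marking the end of a lexicon entry (never a character)

def build_trie(phrases):
    """Store lexicon phrases in a nested-dict trie."""
    root = {}
    for phrase in phrases:
        node = root
        for ch in phrase:
            node = node.setdefault(ch, {})
        node[END] = True
    return root

def beam_search(lattice, n_segments, trie, beam_width=3, max_merge=3):
    """Find the best lexicon-consistent path through the lattice.

    lattice maps (start, end) segmentation points to a list of
    (character, log_probability) candidates (hypothetical classifier output).
    Returns (score, text), or None if no path spells a lexicon entry.
    """
    # beams[i]: hypotheses (score, text, trie_node) ending at segmentation point i
    beams = defaultdict(list)
    beams[0] = [(0.0, "", trie)]
    for start in range(n_segments):
        # Keep only the beam_width best hypotheses at this segmentation point.
        hyps = sorted(beams[start], key=lambda h: h[0], reverse=True)[:beam_width]
        for score, text, node in hyps:
            last = min(start + max_merge, n_segments)
            for end in range(start + 1, last + 1):
                for ch, logp in lattice.get((start, end), []):
                    child = node.get(ch)
                    if child is not None:  # lexicon restricts candidate classes
                        beams[end].append((score + logp, text + ch, child))
    finished = [(s, t) for s, t, node in beams[n_segments] if END in node]
    return max(finished) if finished else None

# Toy input: four primitive segments; arcs merge one or two of them.
trie = build_trie(["胃炎", "胃癌", "肺炎"])
lattice = {
    (0, 1): [("月", -0.8)],
    (1, 2): [("田", -1.0)],
    (0, 2): [("胃", -0.2), ("肺", -1.5)],
    (2, 3): [("火", -0.5)],
    (3, 4): [("火", -0.6)],
    (2, 4): [("災", -0.1), ("炎", -0.9)],
}
result = beam_search(lattice, 4, trie)
print(result[1])  # 胃炎
```

In this toy lattice the classifier's top candidate for the second character is "災", but no lexicon entry continues "胃" with it, so the trie constraint discards that arc and the search returns "胃炎". Over-segmented arcs such as "月" are dropped for the same reason.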