Lexicon-Driven Approach for On-line Handwritten Japanese Disease Name Recognition


Authors: Liang Jianjuan (梁建娟) [1], Zhu Bilan (朱碧兰) [2], Liu Benyong (刘本永) [1], Nakagawa Masaki (中川正樹) [2]

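Affiliations: [1] College of Computer Science and Information, Guizhou University, Guiyang 550025, China; [2] Tokyo University of Agriculture and Technology, Japan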

Source: Computer Engineering and Applications (《计算机工程与应用》), 2010, No. 20, pp. 116-118 (3 pages)

Abstract: This paper presents an effective lexicon-driven method for recognizing on-line handwritten Japanese disease names. The lexicon contains 21,713 disease name phrases and is stored in a trie. In the segmentation stage, an input handwritten disease name string is over-segmented into a sequence of primitive segments based on features such as the spatial relationships between adjacent strokes. One or more consecutive primitive segments are then dynamically merged into candidate character patterns; different merges yield different candidate character sequences, which together form a segmentation candidate lattice in which each node denotes a segmentation point and each arc denotes a candidate character pattern. In the recognition stage, matching against the trie-structured disease name lexicon restricts the character classes considered for each candidate character pattern, and a beam search finds the optimal path through the lattice as the recognition result. In experiments on 500 real handwritten disease name samples, the average recognition time was 0.87 s per disease name and the recognition rate was 83.16%.
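The pipeline described in the abstract (a trie-structured lexicon, a segmentation candidate lattice, and a lexicon-constrained beam search) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three-phrase lexicon, the toy lattice, the log-likelihood-style scores, and the names `TrieNode`, `build_trie`, and `beam_search`. The abstract does not describe the character classifier or the path scoring function, so the sketch simply sums per-character scores along a path.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    children: dict = field(default_factory=dict)
    is_end: bool = False  # True if a lexicon phrase ends at this node


def build_trie(phrases):
    """Store the lexicon phrases character by character in a trie."""
    root = TrieNode()
    for phrase in phrases:
        node = root
        for ch in phrase:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True
    return root


def beam_search(num_segments, arcs, root, beam_width=3):
    """Find the best lexicon-consistent path through a segmentation lattice.

    Nodes 0..num_segments are segmentation points. `arcs` maps (start, end)
    to {character: score}, i.e. the classifier candidates for the pattern
    formed by merging primitive segments start..end-1 (hypothetical scores,
    higher is better). Returns (text, score), or None if no phrase matches.
    """
    # beams[i] holds hypotheses ending at point i: (score, text, trie node)
    beams = [[] for _ in range(num_segments + 1)]
    beams[0] = [(0.0, "", root)]
    for start in range(num_segments):
        # Keep only the best `beam_width` hypotheses at this point.
        beams[start] = sorted(beams[start], key=lambda h: h[0], reverse=True)[:beam_width]
        for score, text, node in beams[start]:
            for (s, end), candidates in arcs.items():
                if s != start:
                    continue
                for ch, ch_score in candidates.items():
                    child = node.children.get(ch)
                    if child is None:
                        continue  # character not allowed by the lexicon here
                    beams[end].append((score + ch_score, text + ch, child))
    # Only hypotheses that cover all segments and end a lexicon phrase count.
    finals = [(score, text) for score, text, node in beams[num_segments] if node.is_end]
    if not finals:
        return None
    best_score, best_text = max(finals)
    return best_text, best_score


# Toy lattice over 3 primitive segments (all values are made up).
arcs = {
    (0, 1): {"胃": -0.4, "肺": -0.9, "買": -0.3},
    (1, 2): {"火": -0.3},
    (2, 3): {"火": -0.3},
    (1, 3): {"炎": -0.2, "災": -0.1},
}
lexicon = build_trie(["胃炎", "胃癌", "肺炎"])
result = beam_search(3, arcs, lexicon)
print(result[0])  # prints 胃炎
```

In this toy lattice the unconstrained best candidates would be 買 followed by 災, but neither extends a path in the trie, so the search never expands them. This is the sense in which lexicon matching restricts the candidate character classes of each candidate pattern.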

Keywords: disease name recognition; lexicon-driven recognition; handwritten character string recognition; beam search

Classification code: TP391.43 [Automation and Computer Technology: Computer Application Technology]

 
