Manchu Character Recognition Post-Processing Based on Bayes Rules and Substitution Set Confusion Matrix

Cited by: 1

Authors: Li Jingjiao (李晶皎)[1], Zhao Ji (赵骥)[1]

Affiliation: [1] School of Information Science and Engineering, Northeastern University, Shenyang 110004, Liaoning, China

Source: Journal of Northeastern University (Natural Science), 2004, No. 11, pp. 1061-1064 (4 pages)

Funding: Natural Science Foundation of Liaoning Province (Project 2001113)

Abstract: The recognition information produced by a Manchu word recognition system is combined with Manchu phrase information, and a statistical database of Manchu phrases and candidate (substitution) word sets is built. Using Bayes' rule, the posterior probabilities of the candidate Manchu words are combined with the prior probabilities of phrases, and a data structure that is sound, effective, and easy to implement is designed to detect and correct the rejected and misrecognized words in the output of the word recognition system, which effectively improves the recognition rate of the Manchu recognition system. Experiments show that post-processing performance depends not only on the language model but also on accurate estimation of the posterior probabilities. In addition, the higher the recognition rate of the word recognition system, the stronger the error-correction ability of the post-processing.
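The combination described in the abstract can be illustrated with a minimal sketch. The abstract gives neither the paper's formulas nor its data structure, so everything below is an assumption made for illustration: the function name `correct_sequence`, the use of a word-pair table as the phrase prior, the `floor` probability for unseen pairs, the Viterbi-style search, and the romanized placeholder tokens. Each word position carries a candidate set with recognizer probabilities (which could, for example, be estimated from a confusion matrix), and the search picks the sequence that maximizes the product of those probabilities and the phrase priors.

```python
import math


def correct_sequence(candidate_sets, phrase_prior, floor=1e-6):
    """Pick the most probable word sequence from per-position candidate sets.

    candidate_sets: list of dicts, one per word position, mapping each
        candidate word to the recognizer's estimated probability (> 0).
    phrase_prior: dict mapping (previous_word, word) to a phrase probability.
    floor: probability assumed for word pairs missing from phrase_prior
        (an illustrative smoothing choice, not taken from the paper).
    """
    # best[w] = (log score of the best path ending in w, that path)
    best = {w: (math.log(p), [w]) for w, p in candidate_sets[0].items()}
    for cands in candidate_sets[1:]:
        new_best = {}
        for w, p in cands.items():
            # Choose the best predecessor under the phrase prior.
            score, path = max(
                (s + math.log(phrase_prior.get((prev, w), floor)), prev_path)
                for prev, (s, prev_path) in best.items()
            )
            new_best[w] = (score + math.log(p), path + [w])
        best = new_best
    return max(best.values())[1]


# Placeholder tokens: the recognizer slightly prefers "gisan" at position 2,
# but the phrase prior only knows the pair ("manju", "gisun").
candidate_sets = [
    {"manju": 0.9, "manja": 0.1},
    {"gisan": 0.55, "gisun": 0.45},
]
phrase_prior = {("manju", "gisun"): 0.3}
corrected = correct_sequence(candidate_sets, phrase_prior)
```

In this toy input the recognizer's top choice at the second position is overridden because the phrase prior strongly favors the other candidate. A rejected word could be handled the same way by giving its position a broader candidate set, though how the paper builds that set is not stated in the abstract.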

Keywords: Manchu; post-processing; candidate word set (substitution set); confusion matrix; Bayes' rule; feature vector; phrase database

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
