Author: Wang Jianhui (王剑辉) [1]
Affiliation: [1] Kexin Software College, Shenyang Normal University (沈阳师范大学科信软件学院)
Source: Journal of Chang'an University (Natural Science Edition) (《长安大学学报(自然科学版)》), 2007, No. 1, pp. 107-110 (4 pages)
Abstract: Three approaches to automatic text recognition were implemented on the meeting minutes of the European Parliament, and their test results were compared with one another: N-gram (bigram or trigram) evaluation, the Garbling model, and the edit-distance approach. The evaluation shows that N-gram evaluation is unsuitable for correcting non-word errors but can be applied to the recognition and correction of other errors. The Garbling model obtains good correction results in general but does not suit all error types. The edit-distance approach achieves the best results in correcting non-word errors. A meaningful combination of the three approaches can improve the results. 1 fig, 8 refs.
Classification: TP39 [Automation and Computer Technology: Computer Application Technology]
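Illustration: the abstract names an edit-distance approach for correcting non-word errors but gives no details. The sketch below is not the paper's implementation. It assumes the standard Levenshtein distance (unit cost for insertion, deletion, and substitution) and a hypothetical helper, `correct_non_word`, that replaces a word missing from a lexicon with its nearest lexicon entry. The lexicon contents and the `max_distance` threshold are illustrative assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed row by row."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def correct_non_word(word: str, lexicon: set, max_distance: int = 2) -> str:
    """Hypothetical helper: map a non-word to its nearest lexicon entry.

    Words already in the lexicon are returned unchanged. If no entry lies
    within max_distance (an assumed threshold), the word is left as-is.
    """
    if word in lexicon:
        return word
    # Ties are broken alphabetically so the result is deterministic.
    best = min(lexicon, key=lambda w: (edit_distance(word, w), w))
    return best if edit_distance(word, best) <= max_distance else word


if __name__ == "__main__":
    lexicon = {"parliament", "payment", "department"}  # toy lexicon
    print(correct_non_word("parlament", lexicon))
```

A lexicon lookup of this kind can only detect non-word errors, which is consistent with the abstract's remark that the three approaches suit different error types.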