Formation method of a high-quality semantic unit base for a multi-language machine translation system  Cited by: 1


Authors: HU Yue (胡玥)[1], GAO Xiaoyu (高小宇)[1], GAO Qingshi (高庆狮)[1]

Affiliation: [1] School of Information Engineering, University of Science and Technology Beijing, Beijing 100083, China

Source: Journal of University of Science and Technology Beijing (《北京科技大学学报》), 2008, No. 6, pp. 698-704 (7 pages)

Funding: National High Technology Research and Development Program of China (Nos. 2006AA01Z140 and 2006AA010101); National Natural Science Foundation of China (No. 60736014)

Abstract: This paper discusses the construction of a high-quality multi-language unified semantic unit knowledge base for a multi-language machine translation system: one that is expandable, complete, and free of discardable units, repetition, and abnormal ambiguity. During construction, a type feature classification method is adopted to reduce the computational complexity effectively: the computation for removing repetition is halved, and the computation for removing discardable units is reduced to O(βN), where N is the size of the semantic unit base and β is a bounded number with β < C for a constant C. All of the algorithms can be implemented on multi-core processors with constant efficiency. The paper also discusses the further decomposition of semantic units and methods for expanding the knowledge base as the number of natural languages increases. The knowledge base can be used not only in multi-language machine translation systems but also as a basic knowledge base for natural language understanding and processing.

Keywords: natural language processing systems; natural language; machine translation; semantic unit

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
