Automatic Extraction of Term Chunks Based on Parallel Corpora of English and Chinese (基于英汉平行语料库的术语组块自动抽取)

Cited by: 2

Author: YANG Fuyi (杨福义) [1]

Affiliation: [1] Anshan Normal University, Anshan, Liaoning 114006

Source: China Terminology (《中国科技术语》), 2018, No. 2, pp. 12-17 (6 pages)

Abstract: Building bilingual parallel corpora as a data resource is the front end of language engineering. Such corpora contain a large amount of terminology and translation knowledge, so studying and developing them in depth is of great significance to terminology translation. This article describes the deep-processing workflow for parallel corpora and the automated annotation of Chinese corpus text. A "grammar symbol language" is used to build a grammatical image of the text and to generate a phrase chunk library. Term translation chunks are then extracted automatically according to phrase structure rules, using artificial intelligence methods, and a term chunk dictionary and word list are generated automatically. The article also gives examples of term chunk queries and of tracing a chunk back to its bilingual example sentences.

Keywords: computational terminology; corpus; knowledge extraction; term components; chunks

Classification: H059 [Language and Writing: Linguistics]; H083

 
