A survey on quality evaluation of machine generated texts  Cited by: 7


Author: QIN Ying[1]

Affiliation: [1] Artificial Intelligence and Human Languages Laboratory, Beijing Foreign Studies University, Beijing 100089, China

Source: Computer Engineering & Science, 2022, No. 1, pp. 138-148 (11 pages)

Funding: Beijing Foreign Studies University research fund (2020SYLZDXM040).

Abstract: The quality evaluation of machine-generated text strongly influences research on Natural Language Generation (NLG) and has become a bottleneck restricting the development of the field. This paper surveys language quality evaluation methods for NLG tasks in a broad sense, including machine translation, automatic summarization, dialogue systems, image captioning, and machine writing. It introduces the characteristics, strengths, and weaknesses of human evaluation and automatic evaluation, together with open evaluation resources, and analyzes the different evaluation perspectives and scope of applicability across tasks. The comparative analysis of evaluation methods can serve as a reference for method fusion and for exploring key issues. Overall, the quality evaluation of machine-generated language is still limited to comparing linguistic form, and many challenges remain in deeper evaluation, such as the accuracy of semantic expression and cohesion and coherence. Based on the difficulties of evaluation and the progress of existing research, the paper analyzes research trends in the quality evaluation of generated language.

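Keywords: quality evaluation of generated language; machine translation; automatic summarization; dialogue system; image captioning; story generation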

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
