A Comparative Study of the Linguistic Features of Large-Model-Generated Answers and Human Answers  Cited by: 3

A Comparative Study of Language between Artificial Intelligence and Human: A Case Study of ChatGPT


Authors: ZHU Junhui; WANG Mengyan; YANG Erhong [1,2]; NIE Jinran; YANG Lin'er; WANG Yujie

Affiliations: [1] National Language Resource Monitoring and Research Center Print Media Language Branch, Beijing Language and Culture University, Beijing 100083, China; [2] School of Information Science, Beijing Language and Culture University, Beijing 100083, China; [3] School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China

Source: Journal of Chinese Information Processing (《中文信息学报》), 2024, No. 4, pp. 17-27 (11 pages)

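Funding: Ministry of Education Humanities and Social Sciences Youth Fund (23YJCZH264); Major Research Project of the State Language Commission (ZDA145-17).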

Abstract: Recent advances in artificial intelligence have brought rapid progress in language generation, and ChatGPT, a chatbot built on natural language generation technology, can converse with people and answer questions fluently. To investigate how machine-generated language differs from human language, this paper collects human and ChatGPT answers to 3293 open-domain Chinese questions as its corpus. For both sets of answers it extracts and computes 161 linguistic features in five dimensions: descriptive characteristics, character and word frequency, character and word diversity, syntactic complexity, and discourse cohesion. Classification algorithms are used to verify that these features can effectively distinguish the two kinds of language, and the features are then examined and compared to explain the similarities and differences between human and machine-generated language. The results show significant differences in 77 linguistic features across three dimensions: descriptive characteristics, character and word frequency, and character and word diversity. Compared with machine-generated answers, human answers exhibit higher readability, lower argument overlap, a more colloquial style, a richer and more varied vocabulary, and greater interactivity.
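This record does not list the 161 features or the paper's extraction pipeline, so the following is only a minimal sketch of what simple descriptive and diversity features might look like for a Chinese answer. The function name `char_features`, the three features chosen, and the character-level treatment (no word segmentation) are all assumptions made for illustration, not the authors' method.

```python
import re


def char_features(text: str) -> dict:
    """Compute three toy character-level features for a Chinese text.

    Hypothetical helper for illustration only; the paper's real features
    are not specified in this record.
    """
    # Keep only CJK ideographs; punctuation and Latin characters are ignored.
    chars = re.findall(r"[\u4e00-\u9fff]", text)
    # Split on common sentence-final punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[。！？!?]", text) if s.strip()]
    n_chars = len(chars)
    return {
        # Descriptive feature: number of Chinese characters.
        "char_count": n_chars,
        # Diversity feature: distinct characters divided by total characters.
        "type_token_ratio": len(set(chars)) / n_chars if n_chars else 0.0,
        # Descriptive feature: average Chinese characters per sentence.
        "mean_sentence_length": n_chars / len(sentences) if sentences else 0.0,
    }


if __name__ == "__main__":
    print(char_features("我喜欢猫。我喜欢狗！"))
```

Feature dictionaries like this, computed for each human and ChatGPT answer, could then be fed to any standard classifier to check how well the features separate the two sources, which is the kind of validation the abstract describes.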

Keywords: ChatGPT; human language; linguistic features; machine learning

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
