Authors: ZHU Junhui; WANG Mengyan; YANG Erhong [1,2]; NIE Jinran; YANG Lin'er; WANG Yujie
Affiliations: [1] National Language Resource Monitoring and Research Center Print Media Language Branch, Beijing Language and Culture University, Beijing 100083, China; [2] School of Information Science, Beijing Language and Culture University, Beijing 100083, China; [3] School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
Source: Journal of Chinese Information Processing (《中文信息学报》), 2024, No. 4, pp. 17-27 (11 pages)
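Funding: Ministry of Education Humanities and Social Sciences Youth Fund (23YJCZH264); Major Research Project of the State Language Commission (ZDA145-17).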
Abstract: Recent advances in artificial intelligence have led to significant strides in language generation technologies, with chatbots such as ChatGPT demonstrating proficiency in conversation and question answering. To investigate the differences between machine-generated language and human language, this paper collects the answers given by humans and by ChatGPT to 3,293 open-domain Chinese questions as two corpora. For each corpus, 161 linguistic features are extracted and computed along five dimensions: descriptive characteristics, word frequency, lexical diversity, syntactic complexity, and discourse cohesion. Classification algorithms are used to verify how effectively these features distinguish the two types of language, and the features are then examined and compared to explain the similarities and differences between human and machine-generated language. The results show significant differences in 77 linguistic features across three dimensions: descriptive characteristics, word frequency, and lexical diversity. Compared with machine-generated answers, human answers tend to exhibit higher readability, lower argument overlap, a more colloquial style, a richer and more varied vocabulary, and greater interactivity.
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]