Discovery Procedure and Distribution: An Effective Model of Language Acquisition, Starting from ChatGPT's Language Learning


Authors: CHEN Baoya [1]; CHEN Yue

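Affiliations: [1] Center for Chinese Linguistics / Department of Chinese Language and Literature, Peking University; [2] University of Georgia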

Source: TCSOL Studies (《华文教学与研究》), 2025, No. 1, pp. 1-8 (8 pages)

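Funding: Major Project of the National Social Science Fund of China, "Research on the Integration and Co-evolution of China's Ethnic Music Culture and Language Data" (22&ZD218).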

Abstract: The emergence of ChatGPT has attracted wide attention, and its most significant advance is arguably the generation of natural language text. ChatGPT can generate entirely new, well-formed sentences, which indicates that it has acquired both the units of natural language text and the rules for generating it. ChatGPT does not need to engage with experience and lacks the experience behind words such as "sour, sweet, bitter, spicy, painful, sad, melancholy," yet it can generate well-formed sentences containing them. This is an important theoretical problem that linguists, artificial intelligence experts, and philosophers need to explain. The initial concepts that determine language rules include words, word classes, grammatical structural relations, semantic structural relations, pragmatic structural relations, and so on. Because ChatGPT learns rules automatically, it does not make use of these initial concepts; the only thing it can draw on is the distribution of natural segments (tokens) in large-scale text. Acquiring units and rules through the distribution of natural segments in text is a way of learning by knowing through language rather than by knowing through direct experience. The success of this way of learning demonstrates the feasibility of the discovery procedure proposed by the structural linguist Harris and of the distribution theory at its core, and it also provides evidence for the feasibility of linguistic formalism. ChatGPT's learning model, based on knowing through language, requires a strong storage-and-computing capability built on big data and very large-scale computation, and it has not yet revealed the human mechanism of language learning based on direct experience; human learning relies on a weak storage-and-computing capability built on small data and basic operations. Nevertheless, the discovery procedure and distribution theory on which learning through language relies remain necessary for human language learning. Both ways of learning, through direct experience and through language, are questions that linguists need to answer.
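The abstract's central idea, that units can be grouped purely by the distribution of segments in text, can be illustrated with a minimal sketch. It is neither Harris's full discovery procedure nor ChatGPT's training method; the toy corpus, the function names `context_profiles` and `cosine`, and the choice of immediate left and right neighbors as the "environment" are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy corpus; real systems rely on vastly larger text.
CORPUS = [
    "the soup tastes sour",
    "the soup tastes sweet",
    "the tea tastes sweet",
    "the tea tastes bitter",
    "the child feels sad",
    "the child feels happy",
]

def context_profiles(sentences):
    """Map each token to a Counter of its (left, right) neighbor pairs."""
    profiles = defaultdict(Counter)
    for sentence in sentences:
        # Pad with boundary markers so edge tokens also have two neighbors.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            profiles[tokens[i]][(tokens[i - 1], tokens[i + 1])] += 1
    return profiles

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

profiles = context_profiles(CORPUS)
same_class = cosine(profiles["sour"], profiles["sweet"])  # shared environment
diff_class = cosine(profiles["sour"], profiles["sad"])    # no shared environment
print(same_class, diff_class)
```

In this toy setting "sour" and "sweet" receive a higher similarity than "sour" and "sad" solely because they occur in the same environments; nothing in the computation refers to taste or emotion.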

Keywords: Mary's Room; metalanguage; initial concepts; discovery procedure

Classification: H08 [Language and Writing: Linguistics]

 
