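Author: Jiang Di (江荻) [1,2]
Affiliations: [1] Institute of Ethnology and Anthropology, Chinese Academy of Social Sciences; [2] School of Linguistic Sciences and Arts, Jiangsu Normal University
Source: 《复印报刊资料(语言文字学)》 (Linguistics and Philology), 2022, No. 8, pp. 97-110 (14 pages)
Funding: Major Project of the National Social Science Fund of China, "Research on the Development and Construction of an Online Retrieval System for Large-Scale Grammatically Annotated Texts of China's Ethnic Languages" (21&ZD304).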
Abstract: This paper reviews three quantitative methods that scholars have used to measure the relationships among Chinese dialects: feature statistics, etymological statistics, and lexical similarity measures. It points out that all three are non-holistic and are constrained both phonetically and lexically. The paper then presents a more suitable computational model, the Levenshtein Distance algorithm (also called edit distance), which reconciles phonetic similarity and lexical correspondence between linear strings across languages or dialects, and which implicitly carries out feature comparison and reflects etymological probability. The automatic classification experiment covers 78 dialects from eight southern groups (Wu, Min, Yue, Xiang, Ke (Hakka), Gan, Hui, and Huai) and 108 Mandarin dialects from the Dongbei, Beijing, Ji-Lu, Jiao-Liao, Zhongyuan, Lan-Yin, Xinan, and Jin groups, for a total of 186 Chinese dialect points. For each dialect, the 100 Swadesh basic words were collected, and similarity was computed between dialects. The results are broadly consistent with the traditional classification but more precise.
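As a rough illustration of the Levenshtein (edit) distance named in the abstract, the sketch below computes the distance between two strings and averages a length-normalized version over paired word lists. The abstract does not say how the paper normalizes or aggregates distances, so `normalized_distance` and `dialect_distance` are hypothetical helpers chosen for illustration, and the inputs are placeholder strings rather than dialect data.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    # prev[j] holds the distance between the first i-1 symbols of a and the first j of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from the first i symbols of a to the empty string
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute x -> y (or match)
        prev = curr
    return prev[-1]


def normalized_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Edit distance divided by the longer length (an assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def dialect_distance(words_a: Sequence[str], words_b: Sequence[str]) -> float:
    """Mean normalized distance over aligned word lists (an assumed aggregation)."""
    if not words_a or len(words_a) != len(words_b):
        raise ValueError("word lists must be non-empty and of equal length")
    total = sum(normalized_distance(x, y) for x, y in zip(words_a, words_b))
    return total / len(words_a)
```

Because the functions accept any sequence of symbols, a transcription could be passed as a list of phonetic segments instead of a raw string, so that multi-character segments count as single units. Whether the paper does this is not stated in the abstract.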
Classification: H17 [Language and Writing: Chinese]; TP311.13 [Automation and Computer Technology: Computer Software and Theory]