Authors: TIAN Di (田地); SHAO Yu-bin (邵玉斌) [1]; DU Qing-zhi (杜庆治) [1]; LONG Hua (龙华) [1]; MA Di-nan (马迪南)
Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; [2] Key Laboratory of Media Integration of Yunnan Province, Kunming 650032, China
Source: Journal of Lanzhou University (Natural Sciences) (《兰州大学学报(自然科学版)》), 2024, No. 3, pp. 350-356 (7 pages)
Funding: Project of the Key Laboratory of Media Integration of Yunnan Province (320225403).
Abstract: To address the unclear boundaries of Chinese words and the neglect of contextual relationships between words and sentences, an ambiguous word segmentation information suppression algorithm based on multiple candidate segmentations is designed. During preprocessing, the weights of the different segmentations of a sentence are computed from a pre-trained word frequency table; the most likely segmentation is distinguished from the other candidates, and the segmentations are merged into the sentence. When the self-attention mechanism extracts the sentence context, the segmentation weight information is added, which supplies the valid boundary information of correct segmentations and suppresses the erroneous contextual relationships introduced by ambiguous segmentations. In comparisons with the MarkBert and W2NER algorithms on the public datasets Resume, MSRA, Weibo, and OntoNotes, the proposed algorithm performs better in prediction accuracy, in robustness as sentence length increases, and in prediction accuracy as the dataset grows.
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
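
The abstract describes weighting candidate segmentations by word frequency and feeding those weights into self-attention. The following is a minimal sketch of that general idea, not the authors' implementation: the toy frequency table `FREQ`, the smoothed unigram scoring, the same-word bias matrix, and the additive scaling factor `alpha` are all assumptions made for illustration.

```python
import math

# Hypothetical toy frequency table; the paper uses a pre-trained one.
FREQ = {"南京": 50, "南京市": 30, "市长": 40, "长江": 60,
        "大桥": 45, "长江大桥": 20, "江": 5}

def segmentations(sentence, vocab, max_len):
    """Enumerate every split of `sentence` into vocabulary words.
    Single characters are always allowed so a split always exists."""
    if not sentence:
        return [[]]
    results = []
    for end in range(1, min(max_len, len(sentence)) + 1):
        word = sentence[:end]
        if end == 1 or word in vocab:
            for rest in segmentations(sentence[end:], vocab, max_len):
                results.append([word] + rest)
    return results

def segmentation_weights(sentence, freq):
    """Score each candidate with an add-one-smoothed unigram model
    (an assumed scoring rule) and normalise the scores with a softmax."""
    max_len = max(len(w) for w in freq)
    total = sum(freq.values()) + len(freq)
    segs = segmentations(sentence, freq, max_len)
    scores = [sum(math.log((freq.get(w, 0) + 1) / total) for w in seg)
              for seg in segs]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [(seg, e / norm) for seg, e in zip(segs, exps)]

def boundary_bias(sentence, weighted_segs):
    """bias[i][j] is the total weight of the candidate segmentations
    that place characters i and j inside the same word."""
    n = len(sentence)
    bias = [[0.0] * n for _ in range(n)]
    for seg, weight in weighted_segs:
        start = 0
        for word in seg:
            end = start + len(word)
            for i in range(start, end):
                for j in range(start, end):
                    bias[i][j] += weight
            start = end
    return bias

def biased_attention(logits, bias, alpha=1.0):
    """Row-wise softmax over logits + alpha * bias, so character pairs
    backed by likely segmentations receive more attention."""
    rows = []
    for logit_row, bias_row in zip(logits, bias):
        z = [l + alpha * b for l, b in zip(logit_row, bias_row)]
        top = max(z)
        exps = [math.exp(v - top) for v in z]
        norm = sum(exps)
        rows.append([v / norm for v in exps])
    return rows

sentence = "南京市长江大桥"
weights = segmentation_weights(sentence, FREQ)
bias = boundary_bias(sentence, weights)
# Uniform logits isolate the effect of the segmentation bias.
attn = biased_attention([[0.0] * len(sentence) for _ in sentence], bias)
```

With uniform logits, the first character attends more strongly to the second (the two share a word in several candidates) than to the fourth, which never shares a word with it under this toy vocabulary. In a real model the logits would come from the query-key products of a trained encoder.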