Authors: ZHANG Meng (张朦); LIU Zhong-Bao (刘忠宝) [1,2]
Affiliations: [1] School of Software, North University of China, Taiyuan 030051, China; [2] Institute of Language Intelligence, Beijing Language and Culture University, Beijing 100083, China
Source: Computer Systems & Applications (《计算机系统应用》), 2023, No. 3, pp. 300-308 (9 pages)
Funding: Later-stage Project of Philosophy and Social Sciences Research, Ministry of Education (21JHQ081).
Abstract: In recent years, digital humanities has attracted wide attention, and research on named entity recognition for ci poetry in the digital humanities setting is emerging. However, few studies have examined the expressiveness of character features, the accuracy of word segmentation, or the effectiveness of domain knowledge. Considering the pictographic nature of Chinese characters and the particularity of ci texts, this paper adds radical, metrical, and tone-rhyme features to the character features and proposes a feature enhancement unit and a feature extraction unit. Knowledge triples of tune pattern titles (词牌) are embedded with the ANALOGY model, and the resulting vectors serve as tune pattern knowledge vectors. The radical, character, metrical, tone-rhyme, and tune pattern knowledge vectors are then deeply fused through a bidirectional long short-term memory network, an attention mechanism, and other models, yielding a named entity recognition method for ci poetry that integrates multiple features. Comparative and ablation experiments on a self-built corpus from Translation of Among Flowers (Hua Jian Ji) (《花间集全译》) show that the proposed method effectively uses multiple features to improve named entity recognition performance, reaching an F1 score of 85.63% on the task.
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]