Authors: SHANG Haiyi; HUANG Jifeng[1]; CHEN Haiguang[1] (College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 201418, China)
Affiliation: [1] College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 201418, China
Source: Journal of Computer Applications (《计算机应用》), 2022, No. S02, pp. 25-30 (6 pages)
Fund: Shanghai Local Capacity Building Project (19070502900).
Abstract: Aiming at the problem that different parts of speech of the same Chinese word represent different relationships in a sentence, a Chinese Grammatical Error Correction (CGEC) model based on a Transformer fused with part-of-speech features was proposed, which incorporates linguistic knowledge as auxiliary information into the CGEC task. First, without changing the length of the sentence sequence, part-of-speech vectors were spliced into the original word embedding layer in different ways, yielding three word embedding schemes: full-difference word embedding, word-difference word embedding and part-of-speech-difference word embedding. Then, the new word embedding schemes were combined with the Transformer model to correct grammatical errors in erroneous sentences. Experimental results show that all three word embedding schemes improve the F0.5 value to varying degrees, and the full-difference word embedding performs best: compared with the Transformer model, its F0.5 increases by 2.73 percentage points and its BLEU (Bilingual Evaluation Understudy) increases by 6.27 percentage points; compared with the CGEC model based on the Transformer enhanced architecture, its F0.5 increases by 1.88 percentage points. When extracting part-of-speech features, the proposed model can focus on the grammatical differences between the source and target sentences, and thus better capture the grammatical features of sentences.
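The record does not include the authors' code; the sketch below is a minimal PyTorch illustration of one possible reading of the embedding idea described in the abstract, in which a part-of-speech (POS) embedding is concatenated to every token's word embedding on both the encoder and decoder sides and projected back to the model dimension, so the sequence length is unchanged. All class names, dimensions (d_model, d_pos), vocabulary sizes and the projection layer are illustrative assumptions, not the paper's released implementation; positional encoding and dropout are omitted for brevity.

```python
# Minimal sketch (assumptions, not the authors' code) of fusing POS-tag embeddings
# with word embeddings before a standard Transformer encoder-decoder for CGEC.
import torch
import torch.nn as nn

class PosAwareEmbedding(nn.Module):
    """Word embedding concatenated with a POS-tag embedding, projected to d_model."""
    def __init__(self, vocab_size, pos_tag_size, d_model=512, d_pos=64):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(pos_tag_size, d_pos)
        # Project the concatenated vector back to d_model so the Transformer layers
        # see the usual hidden size and the sequence length stays the same.
        self.proj = nn.Linear(d_model + d_pos, d_model)

    def forward(self, token_ids, pos_ids):
        fused = torch.cat([self.word_emb(token_ids), self.pos_emb(pos_ids)], dim=-1)
        return self.proj(fused)

class PosAwareTransformerCGEC(nn.Module):
    """Transformer seq2seq whose source/target embeddings carry POS information."""
    def __init__(self, vocab_size=8000, pos_tag_size=30, d_model=512):
        super().__init__()
        self.src_embed = PosAwareEmbedding(vocab_size, pos_tag_size, d_model)
        self.tgt_embed = PosAwareEmbedding(vocab_size, pos_tag_size, d_model)
        self.transformer = nn.Transformer(d_model=d_model, batch_first=True)
        self.generator = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, src_pos, tgt_ids, tgt_pos):
        # Causal mask so each target position only attends to earlier positions.
        tgt_mask = self.transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        hidden = self.transformer(self.src_embed(src_ids, src_pos),
                                  self.tgt_embed(tgt_ids, tgt_pos),
                                  tgt_mask=tgt_mask)
        return self.generator(hidden)  # per-token logits over the vocabulary

# Toy usage: a batch of 2 sentences, 6 tokens each, with random token and POS ids.
if __name__ == "__main__":
    model = PosAwareTransformerCGEC()
    src = torch.randint(0, 8000, (2, 6)); src_pos = torch.randint(0, 30, (2, 6))
    tgt = torch.randint(0, 8000, (2, 6)); tgt_pos = torch.randint(0, 30, (2, 6))
    print(model(src, src_pos, tgt, tgt_pos).shape)  # torch.Size([2, 6, 8000])
```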
Keywords: Chinese grammatical error correction; linguistic knowledge; word embedding; Transformer model; decoder
CLC Number: TP391 [Automation and Computer Technology - Computer Application Technology]