Automatic Annotation of Mathematical Exercise Topics Based on Subject Knowledge

Original title: 融合学科知识的数学习题知识点自动标注模型

Authors: LUO Wenbing (罗文兵); LUO Kaiwei (罗凯威) [2]; HUANG Qi (黄琪); WANG Mingwen (王明文)

Affiliations: [1] Management Science and Engineering Research Centre, Jiangxi Normal University, Nanchang, Jiangxi 330022, China; [2] School of Digital Industry, Jiangxi Normal University, Shangrao, Jiangxi 334000, China; [3] School of Computer and Information Engineering, Jiangxi Normal University, Nanchang, Jiangxi 330022, China

Source: Journal of Chinese Information Processing (《中文信息学报》), 2024, No. 4, pp. 143-155 (13 pages)

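Funding: National Natural Science Foundation of China (62266023); Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ210325, GJJ2200354).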

Abstract: Annotating exercises with topics (knowledge points) is a key task for building structured exercise banks and enabling personalized learning. Mathematical exercises are formula-heavy and tersely worded, so existing annotation models fail to capture their key information and struggle to understand the deeper semantics of the text. In addition, annotation models that incorporate domain knowledge commonly introduce knowledge that is not sufficiently relevant and fuse it too directly. Without effective filtering of information, feature fusion produces a large amount of noise that interferes with the final annotation results. To address this, the paper proposes MKA Gated, a model for automatic topic annotation of mathematical exercises that fuses subject knowledge. The model first uses a pre-trained model to produce initial semantic encodings of the original exercise and of two kinds of refined subject-knowledge text. It then uses an attention mechanism to let the exercise interact with each of the two kinds of subject knowledge, yielding a deep semantic representation for each. Finally, a gating mechanism continuously and implicitly fuses the average-pooled forms of the two deep representations, so as to retain the semantic features of the original exercise representation that benefit the final classification. On a self-built dataset of junior middle school mathematics exercises annotated with topics, the model improves micro-F1, macro-F1, and weighted-F1 over the baseline by 1.99%, 2.99%, and 2.12%, respectively. The experimental results indicate that the proposed method effectively improves topic annotation of mathematical exercises.
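The three stages named in the abstract (encoding, attention-based interaction, gated fusion of average-pooled representations) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the equations, so the scaled dot-product attention, the sigmoid gate of the form g * x + (1 - g) * k, the sequential application of the gate to the two knowledge representations, and every name and number below are assumptions chosen for illustration. Hand-written 2-dimensional vectors stand in for the output of the pre-trained encoder.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys_values):
    """Each exercise token attends over the knowledge tokens
    (scaled dot-product attention; an assumed form)."""
    scale = math.sqrt(len(queries[0]))
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys_values])
        attended.append([
            sum(w * k[d] for w, k in zip(weights, keys_values))
            for d in range(len(q))
        ])
    return attended

def mean_pool(tokens):
    """Average the token vectors into a single vector."""
    n = len(tokens)
    return [sum(t[d] for t in tokens) / n for d in range(len(tokens[0]))]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(exercise_vec, knowledge_vec, gate_weights, gate_bias):
    """Element-wise gate g in (0, 1): g * exercise + (1 - g) * knowledge.
    The gate is computed from the concatenation of both vectors (assumed form)."""
    concat = exercise_vec + knowledge_vec
    fused = []
    for d, (e, k) in enumerate(zip(exercise_vec, knowledge_vec)):
        g = sigmoid(dot(gate_weights[d], concat) + gate_bias[d])
        fused.append(g * e + (1.0 - g) * k)
    return fused

# Toy token embeddings standing in for pre-trained encoder outputs (hypothetical values).
exercise = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
knowledge_a = [[0.5, 0.5], [1.0, 0.0]]   # first kind of subject-knowledge text
knowledge_b = [[0.0, 2.0], [0.2, 0.4]]   # second kind of subject-knowledge text

# Hypothetical fixed gate parameters; in a real model these would be learned.
gate_w = [[0.1, -0.2, 0.3, 0.0], [0.0, 0.4, -0.1, 0.2]]
gate_b = [0.0, 0.0]

exercise_vec = mean_pool(exercise)
deep_a = mean_pool(cross_attention(exercise, knowledge_a))
deep_b = mean_pool(cross_attention(exercise, knowledge_b))

# Fuse the two knowledge representations one after the other (an assumed reading of "continuously").
fused = gated_fusion(exercise_vec, deep_a, gate_w, gate_b)
fused = gated_fusion(fused, deep_b, gate_w, gate_b)
print(fused)
```

Because the gate lies strictly between 0 and 1, each fused component is a convex combination of the exercise and knowledge components. This is one simple way a gate can limit how much knowledge-derived signal overrides the original exercise representation; the paper's actual gate may differ.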

Keywords: knowledge point annotation; subject knowledge; attention mechanism; gating mechanism

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
