Authors: ZHANG Wei; CHEN Xueqi; HAN Jianfeng; YU Xiaojiang; WU Haiyan
Affiliations: [1] Zhejiang Zheneng Linhai Offshore Wind Power Company Limited, Linhai, Zhejiang 317000, China; [2] Hangzhou Blue City New Energy Technology Company Limited, Hangzhou, Zhejiang 310000, China; [3] School of Information Management and Artificial Intelligence, Zhejiang University of Finance & Economics, Hangzhou, Zhejiang 310018, China
Source: Journal of Computer Applications, 2024, Issue S01, pp. 11-17 (7 pages)
Funding: National Natural Science Foundation of China (62306267); Zhejiang Provincial Natural Science Foundation (LY22F020027).
Abstract: Sentence classification methods fall mainly into three groups: machine learning methods based on feature engineering, sequential models, and structured models. Each has shortcomings. Feature-engineering-based methods are insensitive to word order and tend to produce sparse vectors; sequential models ignore syntactic structure information such as phrases and dependency relations; and the accuracy of structured models, such as syntactic trees and binary trees, depends on the syntactic parsing tool. To address these problems, a knowledge-enhanced text classification model called S-CYK, based on a syntactic CYK (Cocke-Younger-Kasami) Graph Neural Network (GNN), was constructed. For each input sentence, the corresponding phrase tree and CYK graph were built and combined into a syntactic CYK graph, and a Relational Graph ATtention network (RGAT) was used to classify the sentence. Experimental results on the public datasets AG's News, DBpedia, ARP (Amazon Review Polarity) and ARF (Amazon Review Full) show that, compared with four state-of-the-art models, namely Semi-Supervised Variational AutoEncoder (SSVAE), Adversarial Fine-Tuning Bidirectional Encoder Representations from Transformers (AFTB), GloVe+ABLSTM and CNN with FastText, S-CYK improves accuracy by 0.04% to 1.21% on the four datasets. S-CYK uses the syntactic CYK graph structure for knowledge enhancement, which effectively strengthens its ability to aggregate sentence information.
Keywords: syntactic knowledge; CYK algorithm; knowledge enhancement; graph neural network; text classification
Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]
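
The abstract says only that a CYK graph is built for each input sentence; it does not give the construction. As a rough illustration, the sketch below builds the generic CYK chart structure, in which every contiguous token span is a node and each span is linked to the two sub-spans produced by every possible split point. The function name `build_cyk_graph`, the edge labels `"left"` and `"right"`, and the half-open span convention are assumptions made for this example and are not taken from the paper. The phrase-tree component and the RGAT classifier are not modeled here.

```python
from typing import List, Tuple

Span = Tuple[int, int]          # (start, end) token offsets, end exclusive
Edge = Tuple[Span, Span, str]   # (child span, parent span, relation label)


def build_cyk_graph(tokens: List[str]) -> Tuple[List[Span], List[Edge]]:
    """Enumerate CYK chart cells as graph nodes and link sub-spans to spans.

    Hypothetical sketch: node and edge definitions are assumptions,
    not the S-CYK authors' construction.
    """
    n = len(tokens)
    nodes: List[Span] = []
    edges: List[Edge] = []
    # Fill the chart bottom-up: spans of length 1 first, the full sentence last.
    for length in range(1, n + 1):
        for start in range(0, n - length + 1):
            end = start + length
            nodes.append((start, end))
            # Every split point yields one left child and one right child.
            for split in range(start + 1, end):
                edges.append(((start, split), (start, end), "left"))
                edges.append(((split, end), (start, end), "right"))
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = build_cyk_graph(["graph", "neural", "network"])
    print(len(nodes), "nodes,", len(edges), "edges")
```

A sentence of n tokens yields n(n+1)/2 span nodes, so the graph grows quadratically with sentence length. The relation labels give a relation-aware graph network distinct edge types to attend over, which is the kind of input an RGAT-style model expects.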