Low time complexity short text classification based on fusion of BERT and broad learning

Cited by: 1


Authors: CHEN Xiaojiang; YANG Xiaoqi; CHEN Guanghao; LIU Wuying (Information Department, Jieyang Campus of Guangdong Open University, Jieyang 522095, Guangdong, China; School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou 510006, Guangdong, China; Department of Software Engineering, Software Engineering Institute of Guangzhou, Guangzhou 510990, Guangdong, China; Shandong Key Laboratory of Language Resources Development and Application, Ludong University, Yantai 264025, Shandong, China; Center for Linguistics and Applied Linguistics, Guangdong University of Foreign Studies, Guangzhou 510420, Guangdong, China)

Affiliations: [1] Information Department, Jieyang Campus of Guangdong Open University, Jieyang 522095, Guangdong; [2] School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou 510006, Guangdong; [3] Department of Software Engineering, Software Engineering Institute of Guangzhou, Guangzhou 510990, Guangdong; [4] Shandong Key Laboratory of Language Resources Development and Application, Ludong University, Yantai 264025, Shandong; [5] Center for Linguistics and Applied Linguistics, Guangdong University of Foreign Studies, Guangzhou 510420, Guangdong

Source: Journal of Shandong University (Engineering Science), 2024, No. 4, pp. 51-58, 66 (9 pages)

Funding: Ministry of Education New Liberal Arts Research and Reform Practice Project (2021060049); Ministry of Education Humanities and Social Sciences Research Youth Fund Project (20YJC740062); Ministry of Education Humanities and Social Sciences Research Planning Fund Project (20YJAZH069); Shandong Province Graduate Education and Teaching Reform Research Project (SDYJG21185); Shandong Province Undergraduate Teaching Reform Research Key Project (Z2021323); Shanghai Philosophy and Social Sciences "13th Five-Year" Plan Project (2019BYY028); Guangzhou Science and Technology Plan Project (202201010061)

Abstract: To address the low efficiency and limited accuracy of short text classification (STC), a high-efficiency, high-precision model was proposed that combines bidirectional encoder representations from transformers with a broad learning classifier (BERT-BL). BERT was first fine-tuned to update its parameters; the fine-tuned BERT then mapped each short text to a word-vector matrix, which was fed into the broad learning (BL) classifier to complete the classification. Experimental results showed that the BERT-BL model achieved state-of-the-art accuracy on all three public datasets, while requiring tens of times less time than the baseline models of support vector machine (SVM), long short-term memory (LSTM), minimum p-norm broad learning (p-BL) and BERT, and its training process did not require a high-performance GPU. Comparative analysis showed that the BERT-BL model not only performs well on STC tasks but also saves a large amount of training time.
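The two-stage pipeline described in the abstract (fine-tuned BERT embeddings fed into a broad learning readout) can be sketched in outline. The following is a minimal, hypothetical NumPy sketch of the broad learning classifier stage only: random stand-in vectors replace the BERT embeddings, and the node counts, activation, and ridge regularizer are illustrative assumptions, not the authors' actual configuration. The key property is that training reduces to one regularized least-squares solve, which is why no GPU is needed.

```python
import numpy as np

def train_broad_learning(X, Y, n_feature=64, n_enhance=128, reg=1e-3, seed=0):
    """Train a minimal broad learning readout.

    X: (n, d) input embeddings (stand-ins for BERT vectors here).
    Y: (n, c) one-hot labels.
    Returns the fixed random projections and the ridge-regression output weights.
    """
    rng = np.random.default_rng(seed)
    # feature nodes: one random projection of the inputs, squashed by tanh
    Wf = rng.standard_normal((X.shape[1], n_feature)) / np.sqrt(X.shape[1])
    Z = np.tanh(X @ Wf)
    # enhancement nodes: a second random projection of the feature nodes
    We = rng.standard_normal((n_feature, n_enhance)) / np.sqrt(n_feature)
    H = np.tanh(Z @ We)
    A = np.hstack([Z, H])  # broad expansion: [feature | enhancement] nodes
    # closed-form ridge readout: W = (A^T A + reg*I)^-1 A^T Y, no backprop
    W = np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ Y)
    return Wf, We, W

def predict(X, Wf, We, W):
    Z = np.tanh(X @ Wf)
    H = np.tanh(Z @ We)
    return np.hstack([Z, H]) @ W

# Toy stand-in data: two Gaussian blobs instead of real BERT embeddings.
rng = np.random.default_rng(1)
X = np.vstack([rng.standard_normal((100, 16)) + 2.0,
               rng.standard_normal((100, 16)) - 2.0])
Y = np.zeros((200, 2))
Y[:100, 0] = 1
Y[100:, 1] = 1

Wf, We, W = train_broad_learning(X, Y)
acc = (predict(X, Wf, We, W).argmax(axis=1) == Y.argmax(axis=1)).mean()
```

Because the only trained parameters are the output weights `W`, obtained from a single linear solve, training cost is dominated by one matrix factorization rather than iterative gradient descent, which is consistent with the paper's claim of low time complexity on CPU.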

Keywords: short text classification; BERT-BL; BERT; broad learning; high precision

CLC number: TP391 [Automation and Computer Technology / Computer Application Technology]

 
