IT Operation and Maintenance Data Classification Method Integrating Data Augmentation and Ensemble Learning

Original title: 融合数据增强与集成学习的IT运维数据分类方法


Authors: LIU Xinquan (刘鑫泉), XU Jian (徐建) [1]

Affiliation: [1] School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094

Source: Computer & Digital Engineering (《计算机与数字工程》), 2024, No. 12, pp. 3579-3584 (6 pages)

Abstract: The rapid development of artificial intelligence for IT operations (AIOps) has created strong demand for the automatic classification of IT operation and maintenance data. Text classification methods based on deep learning have achieved better results than traditional machine learning methods. However, text classification on imbalanced datasets remains challenging, and a single neural network model cannot extract and combine the multi-dimensional information in text. To address this, the paper proposes an IT operation and maintenance data classification method that integrates data augmentation and ensemble learning. The method introduces a text data augmentation technique based on the TF-IDF keyword extraction algorithm and applies it to the classes with few samples to obtain a relatively balanced training dataset. In addition, TextCNN, TextRCNN, and FastText are used as base classifiers; each is trained and used for prediction separately, and the resulting probabilities are combined by soft voting to obtain the IT operation and maintenance data classification model. Theoretical analysis and experimental results show that, compared with traditional classification methods, the proposed method effectively addresses the data imbalance problem and achieves better classification results.
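
The two steps named in the abstract, keyword-aware augmentation of the small classes and soft voting over base classifiers, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the TF-IDF keywords are used, so the sketch assumes that keywords are always kept while other tokens are randomly dropped. The function names (`tfidf_keywords`, `augment`, `balance`, `soft_vote`) and the parameters `top_k` and `drop_prob` are hypothetical, and the probability vectors passed to `soft_vote` stand in for the outputs of trained TextCNN, TextRCNN, and FastText models.

```python
import math
import random
from collections import Counter


def tfidf_keywords(docs, top_k=2):
    """Return, for each tokenized document, its top_k tokens ranked by TF-IDF."""
    n_docs = len(docs)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    keywords = []
    for doc in docs:
        counts = Counter(doc)
        scores = {
            tok: (cnt / len(doc)) * math.log(n_docs / doc_freq[tok])
            for tok, cnt in counts.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords.append(set(ranked[:top_k]))
    return keywords


def augment(doc, keep, rng, drop_prob=0.3):
    """ASSUMED strategy: always keep keywords, randomly drop other tokens."""
    return [tok for tok in doc if tok in keep or rng.random() >= drop_prob]


def balance(docs, labels, top_k=2, seed=0):
    """Add augmented copies of minority-class documents until all classes
    have as many samples as the largest class."""
    rng = random.Random(seed)
    keywords = tfidf_keywords(docs, top_k)
    by_label = {}
    for i, y in enumerate(labels):
        by_label.setdefault(y, []).append(i)
    target = max(len(idx) for idx in by_label.values())
    out_docs, out_labels = list(docs), list(labels)
    for y, idx in by_label.items():
        for j in range(target - len(idx)):
            i = idx[j % len(idx)]  # cycle through the class's originals
            out_docs.append(augment(docs[i], keywords[i], rng))
            out_labels.append(y)
    return out_docs, out_labels


def soft_vote(prob_lists):
    """Average per-class probabilities across classifiers.

    prob_lists holds one probability vector per base classifier for a
    single sample. Returns (averaged probabilities, predicted class index).
    """
    n_models = len(prob_lists)
    avg = [sum(col) / n_models for col in zip(*prob_lists)]
    return avg, max(range(len(avg)), key=avg.__getitem__)
```

In use, `balance` would be applied to the tokenized training set before the three base classifiers are trained, and `soft_vote` would be applied per sample to their predicted probability vectors. Soft voting averages probabilities rather than counting hard votes, so a confident classifier can outweigh two uncertain ones.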

Keywords: text classification; AIOps; data augmentation; deep learning; ensemble learning

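CLC number: TP3 [Automation and Computer Technology: Computer Science and Technology]; TN9 [Electronics and Telecommunications: Information and Communication Engineering]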

 
