Authors: YE Lizhu (叶丽珠); ZHENG Donghua (郑冬花); LIU Yuehong (刘月红); NIU Shaohua (牛少华) [4]
Affiliations: [1] School of Information Technology and Engineering, Guangzhou College of Commerce, Guangzhou 511363, Guangdong, China; [2] Graduate School, Management and Science University, Shah Alam 40100, Selangor, Malaysia; [3] College of Information Science and Engineering, Guilin University of Technology, Guilin 541004, Guangxi, China; [4] School of Mechanical and Electrical Engineering, Beijing Institute of Technology, Beijing 100081, China
Source: Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition), 2022, No. 6, pp. 99-105 (7 pages)
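Funding: Supported by the National Natural Science Foundation of China (61961010); the Guangdong Province Higher Education Characteristic Specialty Construction Project (2020SJTSZY01); the Guangdong Province "14th Five-Year Plan" Higher Education Research Project (21GYB08); the Guangdong Province Characteristic Innovation Project for Regular Universities (2021KTSCX150); and the Guangxi Natural Science Foundation Youth Fund (2018GXNSFBA050029).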
Abstract: To improve the accuracy of imbalanced data classification, the random forest algorithm is used for classification, and the whale optimization algorithm is adopted to optimize the weights of the random forest's weak classifiers, which enhances the adaptability of the random forest to imbalanced data. First, an imbalanced data classification model is built on the random forest; classifying with multiple decision-tree weak classifiers effectively addresses the difficulty caused by sample imbalance. Second, the whale swarm optimization algorithm is used to optimize the weak classifier weights, with the mean classification accuracy as the fitness function, so as to improve the accuracy of the final result obtained by weighted voting of the weak classifiers. Finally, the random forest model obtained through whale swarm optimization is used to classify the imbalanced data. Experiments show that, with reasonably chosen parameters for the whale swarm optimization algorithm, weak classifier weights that yield higher classification accuracy can be obtained, and that the proposed algorithm achieves better classification performance than commonly used imbalanced data classification algorithms.
Keywords: imbalanced data classification; random forest; whale swarm optimization algorithm; weak classifier; decision tree
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
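The weight-optimization step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no equations or parameter values, so the update rules below follow the commonly published whale optimization algorithm (encircling, random search, spiral), and the names `weighted_vote`, `accuracy`, `woa_optimize`, the population size, the iteration count, the [0, 1] weight bounds, and the toy predictions are all assumptions made for illustration. Seeding the population with equal weights is also an assumption; it keeps the result from being worse than plain majority voting on the data used for fitness.

```python
import math
import random


def weighted_vote(preds, weights):
    """Return the label whose supporting trees have the largest total weight."""
    totals = {}
    for label, w in zip(preds, weights):
        totals[label] = totals.get(label, 0.0) + w
    return max(totals, key=totals.get)


def accuracy(weights, tree_preds, y_true):
    """Fitness: fraction of samples classified correctly by weighted voting.

    tree_preds[i][t] is the label predicted by tree t for sample i.
    """
    hits = sum(weighted_vote(row, weights) == y
               for row, y in zip(tree_preds, y_true))
    return hits / len(y_true)


def woa_optimize(fitness, dim, n_whales=20, n_iter=50, seed=0):
    """Maximize `fitness` over weight vectors in [0, 1]^dim (WOA-style sketch)."""
    rng = random.Random(seed)
    # Assumption: the first whale is the equal-weight vector (majority voting).
    whales = [[1.0] * dim]
    whales += [[rng.random() for _ in range(dim)] for _ in range(n_whales - 1)]
    best = list(max(whales, key=fitness))
    best_fit = fitness(best)
    for it in range(n_iter):
        a = 2.0 * (1.0 - it / n_iter)        # decreases linearly from 2 to 0
        for i, x in enumerate(whales):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # Encircle the best whale (|A| < 1) or explore around a random one.
                target = best if abs(A) < 1.0 else rng.choice(whales)
                new = [t - A * abs(C * t - xi) for t, xi in zip(target, x)]
            else:
                # Spiral move toward the best whale (shape constant b = 1).
                l = rng.uniform(-1.0, 1.0)
                new = [abs(b - xi) * math.exp(l) * math.cos(2 * math.pi * l) + b
                       for b, xi in zip(best, x)]
            new = [min(1.0, max(0.0, v)) for v in new]   # keep weights in bounds
            whales[i] = new
            f = fitness(new)
            if f > best_fit:
                best, best_fit = list(new), f
    return best, best_fit


if __name__ == "__main__":
    # Toy validation set: tree 0 is always right, trees 1 and 2 always
    # predict the majority class 0 (a typical failure on imbalanced data).
    y_val = [0, 0, 0, 0, 1, 1]
    val_preds = [[0, 0, 0]] * 4 + [[1, 0, 0]] * 2
    weights, fit = woa_optimize(lambda w: accuracy(w, val_preds, y_val), dim=3)
    print("equal-weight accuracy:", accuracy([1.0] * 3, val_preds, y_val))
    print("optimized weights:", weights, "accuracy:", fit)
```

In practice the per-tree predictions would come from a trained random forest evaluated on held-out data, and accuracy alone can be misleading on strongly imbalanced sets; it is used here only because the abstract names mean classification accuracy as the fitness function.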