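Authors: Liu Ming (刘明)[1], Li Zhongren (李忠任), Zhang Haitao (张海涛), Yu Chunxia (于春霞), Tang Xinghong (唐兴宏), Ding Xiangqian (丁香乾)[1]
Affiliations: [1] College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100; [2] Technology Center, China Tobacco Yunnan Industrial Co., Ltd., Kunming, Yunnan 650024
Source: Laser & Optoelectronics Progress (《激光与光电子学进展》), 2017, No. 10, pp. 449-456 (8 pages)
Funding: National Key Technology R&D Program of China (2015BAF12B01); China Tobacco Yunnan Industrial Co., Ltd. project (JSZX2014YL01; 20530001020152000086)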
Abstract: To address the problems that random forest (RF) faces during feature selection in high-dimensional spaces, namely cumbersome computation, large memory overhead, and low classification accuracy, a feature selection algorithm named binary search random forest pruning (BSRFP) is proposed, which combines binary search (BS) with random forest pruning (RFP). The algorithm first obtains feature importance scores from the Gini purity index and deletes the features with low scores. It then uses the BS algorithm together with a pruning technique based on the diversity among base classifiers to obtain the optimal feature subset and the classifier with the highest classification accuracy. To verify the effectiveness of the algorithm, a cigarette quality recognition model is established and compared with other methods. The results show that the BS algorithm simplifies the feature search process and that the RFP algorithm reduces the size of the RF; the classification accuracy of the RFP algorithm reaches 96.47%; and the features selected by the BSRFP algorithm are more strongly correlated, giving higher accuracy in cigarette quality recognition.
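The abstract does not spell out how the binary search is carried out, so the following is only a minimal sketch of one plausible reading: rank features by an importance score, then binary-search over how many top-ranked features to keep. The function names (`rank_features`, `binary_search_subset`), the stopping rule, and the toy `evaluate` callback are all assumptions made for illustration, not the authors' implementation. In practice `evaluate` would train and score a (pruned) random forest on the given feature subset; the diversity-based pruning step is not shown.

```python
from typing import Callable, List, Sequence, Tuple


def rank_features(importances: Sequence[float]) -> List[int]:
    """Return feature indices sorted by importance score, highest first."""
    return sorted(range(len(importances)), key=lambda i: importances[i], reverse=True)


def binary_search_subset(
    ranked: Sequence[int],
    evaluate: Callable[[Sequence[int]], float],
) -> Tuple[List[int], float]:
    """Binary-search the number of top-ranked features to keep.

    Assumed rule (hypothetical): a smaller subset is accepted whenever its
    accuracy is at least as good as the best accuracy seen so far.
    """
    lo, hi = 1, len(ranked)
    best_k, best_acc = hi, evaluate(ranked[:hi])  # start from all features
    while lo < hi:
        mid = (lo + hi) // 2
        acc = evaluate(ranked[:mid])
        if acc >= best_acc:
            # The smaller subset is no worse: keep it and search even smaller ones.
            best_k, best_acc = mid, acc
            hi = mid
        else:
            # Too few features: search larger subsets.
            lo = mid + 1
    return list(ranked[:best_k]), best_acc


# Toy usage: made-up importance scores and a stand-in accuracy function.
toy_importances = [0.40, 0.05, 0.30, 0.10, 0.15]


def toy_evaluate(subset: Sequence[int]) -> float:
    """Pretend accuracy: high only when features 0 and 2 are both present."""
    return 0.9 if {0, 2} <= set(subset) else 0.5


ranked = rank_features(toy_importances)
selected, accuracy = binary_search_subset(ranked, toy_evaluate)
print(selected, accuracy)
```

With n ranked features this sketch calls `evaluate` roughly log2(n) times instead of once per candidate subset size, which is one way a binary search could simplify the feature search. It relies on accuracy behaving roughly monotonically in the number of top-ranked features, which is an assumption and not something the abstract states.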