Authors: 孔清清, 丁香乾[1], 宫会丽[1], 李忠任, 唐兴宏, 于春霞
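Affiliations: [1] College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100; [2] Technology Center, China Tobacco Yunnan Industrial Co., Ltd., Kunming, Yunnan 650024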
Source: 《分析测试学报》 (Journal of Instrumental Analysis), 2017, No. 10, pp. 1203-1207 (5 pages)
Funding: National Science and Technology Support Program of China (2015BAF12B01); China Tobacco Yunnan Industrial Co., Ltd. project (JSZX2014YL01; 20530001020152000086)
Abstract: Noise and redundant information in near-infrared spectra lower the recognition rate of classification models. To address this, a feature selection algorithm combining random forest and game theory is proposed. The algorithm first measures feature importance with a random forest and preselects the features that are relevant to classification. It then computes weights for the preselected features using an improved Shapley value combined with mutual information, and removes redundant features from the weighted feature set to obtain the optimal feature subset. To validate the algorithm, it was applied to a model for identifying the production area of tobacco leaves. The experimental results show that the proposed feature selection algorithm performs well on this task, with a classification recognition rate of up to 95.88%.
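The second stage described in the abstract weights features with Shapley values and mutual information. The abstract does not specify the "improved" Shapley value or how it is combined with mutual information, so the following is only a minimal sketch of the plain, exact Shapley value under an assumed coalition value: the mutual information between the joint features in a coalition and the class label. The random forest preselection is omitted, and the function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import factorial, log2


def mutual_information(xs, ys):
    """Mutual information I(X; Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def coalition_value(columns, subset, labels):
    """Assumed value function: MI between the joint features in `subset` and the labels."""
    if not subset:
        return 0.0
    joint = list(zip(*(columns[i] for i in subset)))
    return mutual_information(joint, labels)


def shapley_values(columns, labels):
    """Exact Shapley value of each feature; enumerates all coalitions, so small n only."""
    n = len(columns)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                gain = (coalition_value(columns, subset + (i,), labels)
                        - coalition_value(columns, subset, labels))
                phi[i] += weight * gain
    return phi


# Toy data (hypothetical): feature 0 determines the label, feature 1 is an
# exact duplicate of feature 0 (redundant), feature 2 is constant (uninformative).
labels = [0, 0, 1, 1, 0, 0, 1, 1]
columns = [list(labels), list(labels), [0] * len(labels)]

phi = shapley_values(columns, labels)
total = coalition_value(columns, tuple(range(len(columns))), labels)
print(phi, total)
```

In this toy case the two identical features split the credit equally and the constant feature receives zero weight, which illustrates why Shapley-style weights can expose redundancy among features that each look important on their own. Exact enumeration grows exponentially with the number of features, so a real spectral data set would need an approximation.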