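Authors: ZHANG Hao [1], ZHU Chen-long (College of Economics and Management, Jiangsu University of Science and Technology, Zhenjiang 212000, China)
Affiliation: [1] College of Economics and Management, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212000, China
Source: Software Guide (《软件导刊》), 2020, No. 8, pp. 1-5 (5 pages)
Funding: Key Program of the National Natural Science Foundation of China (71331003).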
Abstract: To address the limitations of relying on a single feature selection method, this paper proposes a hybrid Lasso-RF (LRF) feature selection method and applies it to the pricing of online short-term rental listings. Using Airbnb listing data, the experiment first performs feature selection with Lasso regression to handle multicollinearity among the features, then uses the random forest algorithm to refine the remaining features. This yields 35 important features, which are fed into four prediction models for comparison. The results show that multicollinearity among features affects how random forest measures feature importance. Compared with the RF-RF prediction model, the LRF-RF prediction model improves the evaluation metrics R² and MSE by 0.005 and 0.006 respectively and shortens the running time by 0.267 seconds, indicating that the LRF hybrid feature selection method outperforms RF feature selection alone.
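The two-stage idea in the abstract (a Lasso filter followed by a random-forest ranking) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes scikit-learn, runs on synthetic data rather than the Airbnb listings, and the function name `lrf_select` and the parameters `alpha` and `top_k` are hypothetical choices made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler


def lrf_select(X, y, alpha=0.1, top_k=5, random_state=0):
    """Hypothetical two-stage selector: Lasso filter, then RF ranking."""
    # Stage 1: standardise so the L1 penalty treats every column alike,
    # then keep only the columns whose Lasso coefficient is non-zero.
    X_std = StandardScaler().fit_transform(X)
    lasso = Lasso(alpha=alpha, max_iter=10000).fit(X_std, y)
    kept = np.flatnonzero(lasso.coef_ != 0)
    if kept.size == 0:
        return kept

    # Stage 2: rank the surviving columns by random-forest importance
    # and return the indices of the top_k most important ones.
    rf = RandomForestRegressor(n_estimators=100, random_state=random_state)
    rf.fit(X[:, kept], y)
    order = np.argsort(rf.feature_importances_)[::-1][:top_k]
    return kept[order]


# Synthetic data (an assumption for illustration): column 1 is a
# near-duplicate of column 0, and only columns 0 and 2 drive the target.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 10))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=n)
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=n)

selected = lrf_select(X, y, alpha=0.1, top_k=3)
print("selected column indices:", selected.tolist())
```

Running the Lasso stage first means the forest only sees columns that survived the L1 penalty, so strongly correlated columns have less chance to split importance between them. The values of `alpha` and `top_k` here are arbitrary and would need tuning on real data.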
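Keywords: feature selection; Lasso; random forest; online short-term rental; listing price
Classification: TP301 [Automation and Computer Technology: Computer System Architecture]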