Predicting Prices and Analyzing Features of Online Short-Term Rentals Based on XGBoost

Cited by: 19

Authors: Cao Rui; Liao Bin; Li Min [1,2]; Sun Ruina [1,3,4]

Affiliations: [1] College of Statistics and Data Science, Xinjiang University of Finance & Economics, Urumqi 830012, China; [2] School of Information Science and Engineering, Xinjiang University, Urumqi 830008, China; [3] Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China; [4] School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China

Source: Data Analysis and Knowledge Discovery (《数据分析与知识发现》), 2021, No. 6, pp. 51-65 (15 pages)

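Funding: This work is one of the research outcomes of the National Natural Science Foundation of China (Grant No. 61562078) and the Xinjiang Tianshan Youth Program (Grant No. 2018Q073).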

Abstract: [Objective] This paper addresses the lack of reasonable pricing suggestions for listings with different characteristics. [Methods] Using real operating data from the Airbnb platform, the authors propose an XGBoost-based model for price prediction and feature analysis in the online short-term rental market. Lasso is first applied to the raw data for feature extraction and dimensionality reduction; the resulting features are then fed to XGBoost, which is trained iteratively to obtain the best prediction model; finally, SHAP values are used to interpret the model's features. [Results] After hyperparameter tuning, the XGBoost-based model achieves an RMSE of 0.091, an MAE of 0.065, and an R-squared of 0.798, outperforming the four main comparison models. [Limitations] Owing to data-source constraints, the training data could not be combined with the features of real-time online business data streams, which may weaken the model's ability to adapt in real time. [Conclusions] Introducing SHAP improves the model's interpretability. By combining the feature-importance rankings from XGBoost and RandomForest, the study identifies the key factors affecting listing prices, giving hosts a decision-making reference for improving service quality and increasing revenue.
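The three scores reported in the abstract (RMSE, MAE, and R-squared) can be illustrated with a minimal sketch. It assumes the paper uses the standard textbook definitions, which the abstract does not spell out. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Residual sum of squares, shared by RMSE and R-squared.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # Total sum of squares around the mean of the observed values.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Hypothetical observed and predicted values, for illustration only.
y_true = [4.0, 5.0, 6.0, 7.0]
y_pred = [4.5, 5.0, 5.5, 7.0]
rmse, mae, r2 = regression_metrics(y_true, y_pred)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.3f}")
```

Lower RMSE and MAE and a higher R-squared indicate a better fit, which is the sense in which the abstract ranks the XGBoost model above the comparison models.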

Keywords: machine learning; pricing model; online short-term rental; XGBoost model; SHAP value

CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
