Analysis of Beijing Second-Hand House Price Based on Random Forest (基于随机森林方法的北京市二手房价格研究)
Authors: Li Xiaotong (李晓童), Guo Xuan (郭萱), Wang Chengjie (王成杰); College of Science, China University of Petroleum (Beijing), Beijing
Keywords: Second-Hand House; Housing Price Forecast; Bootstrap Sampling; Decision Trees; Random Forest
Abstract: With economic development and a shrinking supply of available land, second-hand house prices have risen continuously. By the end of May 2016, the average price of second-hand houses in Beijing had exceeded ¥60,000/m². Evaluating second-hand house prices not only matters for residents' lives but also provides a useful reference for the government's macroeconomic regulation and control. Current mathematical models of housing prices include the linear regression model, the neural network model (NN) and the support vector machine model (SVM). In the linear regression model, the assumption of a linear relationship may introduce larger errors, while NN and SVM have been shown to have poor interpretability. Based on the prices of 16,795 second-hand houses in Beijing, a random forest model was established to study the factors influencing house prices and to forecast them. The change in explained variance identifies lat (residential latitude), long (residential longitude) and cate (residential area) as the three most significant predictors of housing price, while the random forest model ranks cate, lat and long as the most important. Analysis of the OOB (out-of-bag) samples gives the random forest a precision of 0.69 in forecasting second-hand house prices. Finally, the price data were fed into the NN and SVM models, which yielded precisions of 5.15 and 1.10 respectively. The results show that the random forest forecast performs best, followed by SVM, and that NN prediction is not suited to the second-hand house data in this paper.
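The bootstrap sampling and out-of-bag (OOB) evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it bags one-split regression trees (stumps) on a single synthetic feature, whereas a full random forest grows deeper trees and also samples candidate features at each split. The function names, the synthetic data and the use of mean squared error are assumptions made for illustration; the abstract does not define its precision measure.

```python
import random
from statistics import mean

def bootstrap_indices(n, rng):
    """Draw n indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]

def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimizing squared error."""
    best = None
    # Candidate thresholds: every unique x except the smallest,
    # so that both sides of the split x < t are non-empty.
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x identical: predict the overall mean
        m = mean(ys)
        return (xs[0], m, m)
    return best[1:]

def predict_stump(stump, x):
    t, left_mean, right_mean = stump
    return left_mean if x < t else right_mean

def oob_predictions(xs, ys, n_trees=50, seed=0):
    """Average, for each point, the predictions of trees that did not see it."""
    rng = random.Random(seed)
    n = len(xs)
    votes = [[] for _ in range(n)]
    for _ in range(n_trees):
        idx = bootstrap_indices(n, rng)
        in_bag = set(idx)
        stump = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        for i in range(n):
            if i not in in_bag:  # point i is out-of-bag for this tree
                votes[i].append(predict_stump(stump, xs[i]))
    return [mean(v) if v else None for v in votes]

# Synthetic data (hypothetical): a noisy step function of one feature.
data_rng = random.Random(42)
xs = list(range(40))
ys = [(1.0 if x < 20 else 3.0) + data_rng.gauss(0, 0.1) for x in xs]

preds = oob_predictions(xs, ys)
scored = [(p, y) for p, y in zip(preds, ys) if p is not None]
oob_mse = mean((p - y) ** 2 for p, y in scored)
print("OOB mean squared error:", oob_mse)
```

Because each tree is scored only on points left out of its own bootstrap sample, the OOB error serves as a built-in validation estimate without a separate hold-out set.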
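Cite this article: Li Xiaotong, Guo Xuan, Wang Chengjie (2017) Analysis of Beijing Second-Hand House Price Based on Random Forest. 数据挖掘 (Hans Journal of Data Mining), 7, 37-45. doi: 10.12677/HJDM.2017.72004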