Authors: ZHANG Junyi; JIA Wenjue; SUN Zhongxiao; ZHANG Qian [1,2]
Affiliations: [1] Key Laboratory of Urban Land and Resources Monitoring and Simulation, Ministry of Natural Resources, Shenzhen, Guangdong 518000, China; [2] College of Land Science and Technology, China Agricultural University, Beijing 100193, China; [3] Information Center of the Ministry of Natural Resources, Beijing 100812, China
Source: Science of Surveying and Mapping (《测绘科学》), 2024, No. 10, pp. 156-165 (10 pages)
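Funding: Open Fund Project of the Key Laboratory of Urban Land and Resources Monitoring and Simulation, Ministry of Natural Resources (KF-2021-06-113).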
Abstract: To address quality problems in real estate registration data, this paper examines methods for processing and classifying residential transaction data in the real estate registration database. Taking residential transaction prices in S city as an example, it develops a technical workflow for data cleaning, processing, and classification built on statistical methods such as kernel density estimation. (1) The data are categorized according to their distribution: extreme values are eliminated first, followed by duplicate values and special values in the valid data, which yields the main-body data that most closely reflects actual local conditions. The main-body data comprise low-probability data, market-behavior data, and non-market-behavior data. (2) For the market-behavior data, the mean transaction price is extracted and calculated, then compared with housing price data published by intermediary agencies; in most districts the difference is below 15%. The quantitative analysis confirms that the data in the real estate registration database are more authoritative and effective. The study proposes a technical workflow that combines kernel density estimation with second-order differencing to process and classify real estate registration data. The results indicate that this data quality improvement method is robust and effective, and it provides a methodological basis for analyzing and mining national real estate registration data, for quantitative regulation, and for supplying more accurate data resources to applications of those data.
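The abstract names kernel density estimation and second-order differencing but does not give the exact procedure, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes records are `(record_id, price)` pairs, that duplicates share a record ID, and that the main body is bounded on each side of the density peak by the point where the second-order difference of the estimated density is largest (where the curve bends into its tail). The function names, the bandwidth, and the synthetic prices are all hypothetical.

```python
import math
from statistics import NormalDist


def drop_duplicates(records):
    """Keep the first record seen for each record ID (assumed duplicate rule)."""
    seen, unique = set(), []
    for record_id, price in records:
        if record_id not in seen:
            seen.add(record_id)
            unique.append((record_id, price))
    return unique


def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


def main_body_bounds(prices, bandwidth, grid_size=400):
    """Return (lower, mode, upper) price cut-offs for the main body.

    Assumed rule: on each side of the density peak, cut at the grid point
    where the second-order difference of the density is largest.
    """
    lo = min(prices) - 3 * bandwidth
    hi = max(prices) + 3 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    density = gaussian_kde(prices, grid, bandwidth)
    # second[i] is the second-order difference centred on grid[i + 1].
    second = [
        density[i] - 2 * density[i + 1] + density[i + 2]
        for i in range(grid_size - 2)
    ]
    peak = max(range(grid_size), key=density.__getitem__)
    left = max(range(0, peak - 1), key=second.__getitem__)
    right = max(range(peak, grid_size - 2), key=second.__getitem__)
    return grid[left + 1], grid[peak], grid[right + 1]


# Synthetic unit prices (not data from the paper): a bell-shaped main body,
# two extreme prices, and ten duplicated registrations.
base = NormalDist(mu=50_000, sigma=5_000)
records = [(i, base.inv_cdf((i + 0.5) / 400)) for i in range(400)]
records += [(400, 5_000.0), (401, 120_000.0)]
records += records[:10]

unique = drop_duplicates(records)
lower, mode, upper = main_body_bounds([p for _, p in unique], bandwidth=2_000)
main_body = [(rid, p) for rid, p in unique if lower <= p <= upper]
```

In this sketch the two extreme prices fall outside `[lower, upper]` and are excluded from `main_body`. Separating market-behavior from non-market-behavior data within the main body would need a further rule that the abstract does not specify.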
Keywords: data quality improvement; kernel density estimation; real estate registration; urban residential prices
Classification code: P273 [Astronomy and Earth Sciences: Surveying and Mapping Science and Technology]