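Affiliations: [1] School of Electronic and Control Engineering, Chang'an University, Xi'an 710064; [2] School of Highway, Chang'an University, Xi'an 710064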
Source: Journal of Transportation Systems Engineering and Information Technology (《交通运输系统工程与信息》), 2016, No. 3, pp. 81-87 (7 pages)
Funding: 973 Program (2013CB036003); National Natural Science Foundation of China (51408054); Fundamental Research Funds for the Central Universities (310832161006)
Abstract: Acquiring tunnel operational data, such as environmental conditions and traffic status, completely and in real time, and mining it in depth, is the foundation for improving emergency response capacity and enabling operational safety early warning. This paper proposes a missing data imputation method based on the Random Forest algorithm. The incomplete data set is partitioned according to its missing features. A Random Forest regression model is then built to impute the missing values iteratively, and a stopping criterion for the iteration is defined. The optimal combination of the number of decision trees and the number of variables randomly sampled at each split is identified by minimizing the normalized root mean square error. Imputation results on a highway tunnel operational data set with missing values indicate that the method achieves high precision and good robustness: compared with the KNN, SVD, MICE, and PPCA imputation methods, it reduces the normalized root mean square error by more than 25%. Parallel computation also improves imputation efficiency substantially, which compensates for the slow imputation speed of the method and ensures that imputation is both effective and timely.
Keywords: highway transportation; missing data imputation; Random Forest; highway tunnel; operation management
Classification number: U491 [Transportation Engineering: Transportation Planning and Management]
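
The abstract selects the Random Forest settings by minimizing the normalized root mean square error (NRMSE), but the record does not give the formula. The sketch below assumes one common definition, the square root of the mean squared imputation error divided by the variance of the true values; the paper may normalize differently. The function names `nrmse` and `select_best`, and the layout of the score dictionary, are hypothetical and chosen only for illustration.

```python
import math


def nrmse(true_values, imputed_values):
    """NRMSE under an ASSUMED definition: sqrt(MSE / variance of true values).

    Both arguments hold only the entries that were missing and then imputed.
    """
    n = len(true_values)
    if n != len(imputed_values) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_true = sum(true_values) / n
    mse = sum((t - p) ** 2 for t, p in zip(true_values, imputed_values)) / n
    variance = sum((t - mean_true) ** 2 for t in true_values) / n
    if variance == 0:
        raise ValueError("true values are constant; NRMSE is undefined")
    return math.sqrt(mse / variance)


def select_best(scores):
    """Return the (n_trees, n_split_vars) key with the smallest NRMSE.

    `scores` is a hypothetical dict mapping each tested parameter pair
    to the NRMSE measured for it.
    """
    return min(scores, key=scores.get)
```

Under this definition a value of 0 means a perfect imputation, and filling every gap with the mean of the true values gives exactly 1, so lower is better and values below 1 indicate the model beats a mean fill.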