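Affiliations: [1] School of Information Engineering, Chang'an University, Xi'an 710064, Shaanxi, China; [2] Shaanxi Provincial Department of Transport, Xi'an 710075, Shaanxi, China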
Source: Journal of Chang’an University (Natural Science Edition), 2018, No. 5, pp. 205-212 (8 pages)
Funding: Natural Science Basic Research Plan of Shaanxi Province (2017JQ5014); Shaanxi Provincial Transportation Research Project (15-45r)
Abstract: To sense expressway traffic operation status accurately and comprehensively, a data mining method for identifying abnormal traffic events from massive expressway toll data is proposed. First, expressway toll data for Guizhou Province from January 2017 were selected; records for the specified entry and exit stations were filtered, redundant fields were removed, and each vehicle's travel time on the road section was calculated from the times at which it entered and left the toll stations. Then, a fast peak clustering algorithm was applied to travel time and gross vehicle weight: the Euclidean distances between data points were calculated, and the resulting distance matrix was used as the input of the algorithm to compute two indicators for each data point, its local density ρ and its distance δ to points of higher density. Points for which both indicators are high were taken as cluster centers, the non-center points were then classified and optimized, and the clustering result was output. Besides the normal data, which were divided into several clusters, the result contained noise points that differ markedly from most of the normal data, that is, abnormal data, and these were analyzed in detail. Next, an outlier detection method was used to clean the filtered data and extract abnormal data, detecting abnormal events such as excessively long or short travel times and excessively high or low gross vehicle weights. Finally, the abnormal data obtained by the outlier detection method were compared with those obtained by the fast peak clustering algorithm. The results show that fast peak clustering identifies abnormal events with an accuracy about 20% higher than that of the outlier detection method, which verifies the validity and accuracy of the proposed algorithm. The proposed algorithm can effectively and accurately identify abnormal events hidden in toll data, such as road congestion, long stops, suspected toll evasion, and network equipment failures, and thereby provide data support for expressway operation services and management decisions.
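The clustering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `density_peak_labels`, the cutoff distance `d_c`, the thresholds `rho_min` and `delta_min`, and the toy records are all assumptions made for illustration. The sketch also assumes that noise can be approximated as points with low ρ and high δ, and that the two features are already on comparable scales.

```python
import math


def density_peak_labels(points, d_c, rho_min, delta_min):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise.

    Hypothetical helper: d_c, rho_min and delta_min are illustrative
    parameters, not values taken from the paper.
    """
    n = len(points)
    # Pairwise Euclidean distance matrix, used as the algorithm's input.
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density rho: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Visit points from densest to sparsest; equal densities are ordered by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    nearest_denser = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbour; use its largest distance.
            delta[i] = max(dist[i])
            continue
        # delta: distance to the closest point that ranks as denser.
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        nearest_denser[i] = j
    labels = [None] * n
    next_id = 0
    for i in range(n):
        if rho[i] >= rho_min and delta[i] >= delta_min:
            labels[i] = next_id  # both indicators high: cluster center
            next_id += 1
        elif delta[i] >= delta_min:
            labels[i] = -1       # isolated, low density: treated as noise
    # Remaining points inherit the label of their nearest denser neighbour.
    for i in order:
        if labels[i] is None:
            j = nearest_denser[i]
            labels[i] = labels[j] if j >= 0 else -1
    return labels


# Toy records: (travel time in minutes, gross vehicle weight in tonnes).
toll_records = [
    (30, 10), (31, 10), (30, 11), (31, 11), (30.5, 10.5),
    (60, 40), (61, 40), (60, 41), (61, 41), (60.5, 40.5),
    (200, 5),  # unusually long travel time
]
print(density_peak_labels(toll_records, d_c=2.0, rho_min=3, delta_min=10.0))
# prints [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, -1]
```

In this toy data the record with a 200-minute travel time has no close neighbours and lies far from every denser point, so it is flagged as noise, while the two dense groups form separate clusters. On real toll data the thresholds would have to be chosen by inspecting the distribution of ρ and δ.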
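Keywords: traffic information and control engineering; intelligent transportation; abnormal event analysis; fast peak clustering; outlier detection; expressway toll data; data mining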
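Classification: U491 [Transportation Engineering: Transportation Planning and Management]; TP301.6 [Transportation Engineering: Road and Railway Engineering]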