近似到达时间约束下的语义轨迹频繁模式挖掘  被引量:6

Frequent Pattern Mining With Approximate Arrival-Time in Semantic Trajectories

在线阅读下载全文

作  者:吴瑕 唐祖锴[3] 祝园园[1] 彭煜玮 彭智勇[2] WU Xia;TANG Zu-Kai;ZHU Yuan-Yuanl;PENG Yu-Wei;PENG Zhi-Yong(State Key Laboratory of Software Engineering(Wuhan University),Wuhan 430072,Chin;Computer School,Wuhan University,Wuhan 430072,China;School of Computer Science and Technology,Wuhan University of Technology,Wuhan 430070,China)

机构地区:[1]软件工程国家重点实验室(武汉大学),湖北武汉430072 [2]武汉大学计算机学院,湖北武汉430072 [3]武汉理工大学计算机科学与技术学院,湖北武汉430070

出  处:《软件学报》2018年第10期3184-3204,共21页Journal of Software

基  金:科技部国家重点研发计划(2016YFB1000700);国家自然科学基金(61502349)~~

摘  要:随着GPS定位技术的不断发展与智能移动设备的普及,轨迹数据的获取变得越来越容易,同时,轨迹数据相关应用的需求也逐渐增多.在轨迹数据上加入语义信息,可以得到体积较小、质量较高、能够更好地反映用户行为的语义轨迹,在其上实现旅游线路推荐、路线预测、用户生活模式挖掘、朋友推荐等应用,可以更好地满足用户需求.挖掘语义轨迹的频繁模式是实现这些应用的技术基础,而在很多情况下,用户对语义轨迹频繁模式常存在到达时间方面的需求,比如按特定时间游玩热门景点的同时需要按时到达车站候车.现有的语义轨迹模式挖掘方法大多没有考虑到达时间的约束,挖掘出的频繁模式缺少到达时间信息;少数方法考虑了精确的到达时间,但因为约束太强会导致无法挖掘到频繁的模式.因此,首次对近似到达时间约束下的语义轨迹频繁模式(approximatearrival-time constrained frequent pattern,简称AAFP)挖掘方法进行了研究,并给出了其形式化定义;通过时间轴划分提出了挖掘AAFP的基线算法,并通过建立索引AAP-tree提出了改进后的高效、灵活的AAFP挖掘算法;之后提出了信息熵增量公式,并给出了时间轴划分及AAP-tree的高效维护方法;最后在真实数据集上进行实验,验证了方法的有效性及高效性.Along with the development of the GPS positioning technology and smart mobile devices, more and more trajectory data are collected continuously every day. Thus, managing and mining useful information from these trajectories is critical in many application areas. Compared with raw trajectory data, semantic trajectory data equipped with semantic information has better quality, less volume and higher description ability, and thus it can be used in many applications such as trip recommendation, next location prediction, life pattern understanding, and friend recommendation. Mining frequent pattern in semantic trajectories is the fundamental problem in above tasks. In many circumstances, users may have the requirements on the arrival-time, e.g., users may want to visit a popular view spot at a certain timestamp and then arrive the railway station on time. Most of existing approaches on semantic trajectory pattern mining do not consider the arrival-time, and only a few existing approaches take the accurate arrival-time as the constraint, but they can barely find frequent patterns under such a strict time constraint. This paper, for the first time, studies the approximate arrival-time constrained frequent pattern (AAFP) mining problem. First, a baseline algorithm of mining AAFP is given by dividing the time axis into intervals. Then, an improved flexible algorithm is proposed to significantly improve the efficiency based on the AAP-tree index. Finally, a strategy to maintain the AAP-tree and the set of time axis partitions is introduced based on incremental information entropy. The experimental results on real trajectory datasets validate the effectiveness and efficiency of the proposed algorithms.

关 键 词:轨迹数据 语义轨迹 近似到达时间 轨迹频繁模式 频繁模式挖掘 

分 类 号:TP311[自动化与计算机技术—计算机软件与理论]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象