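Affiliation: [1] Electronic Engineering Institute, Hefei 230037
Source: Computer Science (《计算机科学》), 2016, No. 12, pp. 218-222 (5 pages)
Funding: Supported by the National Defense Key Laboratory Fund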
Abstract: Inferring whether a social tie exists between users from the location information of LBS users is an emerging problem in intelligence mining from location big data, and it can provide supporting information for group discovery and community detection. Based on the theory of spatio-temporal co-occurrence, the features of co-occurrence regions are grouped into four categories, and a random-forest-based method for inferring social ties between users is proposed. The method consists of a feature selection phase and a training and classification phase. First, because uncorrelated and redundant features in the feature space degrade inference performance, a feature selection algorithm based on the Fisher criterion and the χ² test is proposed to remove them. Second, random forests are used for classification, which overcomes the slow training and tendency to overfit of existing methods. Experiments on check-in data of LBSN users indicate that the method can infer social ties with relatively low computational cost and relatively high accuracy.
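Keywords: location-based services; spatio-temporal co-occurrence; random forests; social tie inference
Classification code: TP309 [Automation and Computer Technology - Computer System Architecture]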
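The feature selection phase mentioned in the abstract relies in part on the Fisher criterion. The abstract does not give the exact formulation used, so the following is only a minimal sketch of one common two-class Fisher score (squared difference of class means over the sum of class variances). The feature names and the toy values are hypothetical and are not taken from the paper.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher score of a single feature (labels are 0 or 1).

    Assumed form: (mean_1 - mean_0)^2 / (var_1 + var_0).
    The paper's own definition may differ.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    between = (mean(pos) - mean(neg)) ** 2
    within = pvariance(pos) + pvariance(neg)
    if within == 0:
        # Constant within both classes: uninformative unless the means differ.
        return 0.0 if between == 0 else float("inf")
    return between / within


def rank_features(rows, labels, names):
    """Return feature names sorted from most to least discriminative."""
    scores = {
        name: fisher_score([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: one row per user pair, label 1 = social tie exists.
names = ["co_occurrence_count", "noise"]
rows = [(5, 1), (6, 2), (7, 1), (1, 2), (2, 1), (0, 2)]
labels = [1, 1, 1, 0, 0, 0]
ranking = rank_features(rows, labels, names)
```

Features with low scores would be candidates for removal before training a classifier such as a random forest; the threshold and the way the score is combined with the χ² test are not specified in the abstract and are left open here.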