Affiliation: [1] College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2023, No. 1, pp. 146-154 (9 pages)
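Funding: Supported by the General Program of the National Natural Science Foundation of China (61972355) and the Zhejiang Provincial Public Welfare Technology Research Program (LGG19F020012).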
Abstract: Different types of nodes and edges in a heterogeneous information network (HIN) form rich semantic relationships, and the text attributes of nodes also affect these relationship patterns. Compared with homogeneous networks, mining heterogeneous networks can yield more valuable results, but the diversity of node and edge types also makes it more challenging. Well-designed query techniques support analysis of both the structure and the semantics of a heterogeneous network. Existing heterogeneous network query methods usually rely on meta-path-based graph queries; however, how to better compute the importance of meta-paths and combine it with node text attributes to achieve more accurate queries remains an open problem. In addition, effectively presenting the semantic associations and features among multiple query results is important for helping users quickly understand the heterogeneous relationship patterns of the network. Driven by the tasks of three stages (query input, subgraph query, and result analysis), this paper proposes a graph query method that incorporates short-text semantics and, based on it, implements a graph query visual analysis system for heterogeneous networks. First, possible relationship patterns are extracted from the query input and represented as meta-paths, whose importance is computed together with the short text entered by the user. Next, multiple meta-paths are combined, according to their importance, into the relationship pattern used for the query. Then, dimensionality reduction and clustering are applied to the feature vectors of the result subgraphs, and on this basis the structural features, semantic features, and node attributes of the result subgraphs are visualized. Finally, a Web-based HIN graph query visual analysis system is designed and implemented. Experiments on the DBLP dataset show that the constraint imposed by the short text on query results improves query accuracy, and a case study on Douban movie data further shows that the system can effectively analyze and mine the data and relationship features of heterogeneous networks through queries.
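The abstract does not give the formulas used to score meta-paths, so the following is only an illustrative sketch of the general idea: enumerate the instances of a meta-path on a small typed graph and weight the meta-path by how well the user's short text matches the text attributes along those instances. The toy data, the function names (`path_instances`, `text_overlap`, `meta_path_importance`), and the word-overlap score are all assumptions made for illustration and are not the paper's method.

```python
from collections import defaultdict

# Toy heterogeneous network: node id -> (node type, text attribute).
# All nodes, edges, and texts here are invented for illustration.
NODES = {
    "a1": ("Author", "alice"),
    "a2": ("Author", "bob"),
    "p1": ("Paper", "graph query visualization"),
    "p2": ("Paper", "deep image segmentation"),
    "v1": ("Venue", "visualization conference"),
}
EDGES = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("p1", "v1"), ("p2", "v1")]

# Undirected adjacency list.
ADJ = defaultdict(set)
for u, v in EDGES:
    ADJ[u].add(v)
    ADJ[v].add(u)


def path_instances(start, meta_path):
    """Enumerate simple paths from `start` whose node types follow `meta_path`."""
    if NODES[start][0] != meta_path[0]:
        return []
    paths = [[start]]
    for node_type in meta_path[1:]:
        paths = [
            p + [n]
            for p in paths
            for n in sorted(ADJ[p[-1]])
            if NODES[n][0] == node_type and n not in p
        ]
    return paths


def text_overlap(short_text, path):
    """Fraction of query words that occur in the text attributes along `path`."""
    query = set(short_text.lower().split())
    words = set()
    for n in path:
        words.update(NODES[n][1].lower().split())
    return len(query & words) / len(query) if query else 0.0


def meta_path_importance(start, meta_path, short_text):
    """Assumed score: mean short-text overlap over all instances of the meta-path."""
    instances = path_instances(start, meta_path)
    if not instances:
        return 0.0
    return sum(text_overlap(short_text, p) for p in instances) / len(instances)


if __name__ == "__main__":
    for mp in (["Author", "Paper", "Author"], ["Author", "Paper", "Venue"]):
        print(mp, meta_path_importance("a2", mp, "graph visualization"))
```

In this sketch, meta-paths with higher scores would be given more weight when combined into a query pattern; a real system would likely replace the word-overlap score with a learned text-similarity measure.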
Keywords: visualization; visual analytics; heterogeneous information network; graph query; meta-path; short text
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]