Constructing Knowledge Graph for Financial Securities and Discovering Related Stocks with Knowledge Association

Cited by: 7

Authors: Liu Zhenghao; Qian Yuxing; Yi Tianlong; Lv Huakui

Affiliations: [1] School of Information Management, Wuhan University, Wuhan 430072, China; [2] Institute of Big Data, Wuhan University, Wuhan 430072, China; [3] Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China

Source: Data Analysis and Knowledge Discovery, 2022, No. 2, pp. 184-201 (18 pages)

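Funding: This work is one of the research outcomes of the Key Support Project of the Major Research Plan of the National Natural Science Foundation of China (Grant No. 91646206) and the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (Grant No. 2020AAA0108505).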

Abstract: [Objective] This paper constructs a domain knowledge graph from the perspective of knowledge association and uses it to discover industry characteristics and related stocks, aiming to give investors a new perspective and basis for portfolio trading decisions. [Methods] First, we constructed a "seed" knowledge graph centered on stock data. Second, we performed entity extraction and relation classification on unstructured text with the FinBERT pre-trained model to generate triples. Third, we fused the seed graph and the triples to complete the knowledge graph for financial securities. Fourth, we applied graph data mining algorithms such as link prediction and similarity calculation to the graph to discover related stocks and their hidden characteristics. The findings were preliminarily verified by statistical methods. [Results] The resulting knowledge graph contains 111,845 entities and 163,370 relationships. Based on the graph, we analyzed the 10 cross-industry stocks with the highest similarity to "Northeast Securities", and used the "Sihuan Biology" case to examine potential nonlinear correlations between stocks. [Limitations] The constructed knowledge graph only considers the impact of static information (e.g., industry and shareholder ownership) on stock correlation. [Conclusions] Constructing a knowledge graph for the financial securities domain and discovering related stocks helps investors form effective portfolio strategies and provides an analytical approach and data support for stock trend prediction.
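The similarity-calculation step described in the abstract can be illustrated with a small sketch. The abstract does not say which similarity measure the authors used, so the Jaccard coefficient over shared graph neighbors is an assumption here, and every entity name and triple below is hypothetical toy data rather than data from the paper.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (head, relation, tail) triples; NOT data from the paper.
TRIPLES = [
    ("StockA", "belongs_to", "Securities"),
    ("StockB", "belongs_to", "Biotech"),
    ("StockC", "belongs_to", "Securities"),
    ("HolderX", "holds", "StockA"),
    ("HolderX", "holds", "StockB"),
    ("HolderY", "holds", "StockA"),
    ("HolderY", "holds", "StockB"),
    ("HolderZ", "holds", "StockC"),
]
STOCKS = {"StockA", "StockB", "StockC"}


def build_neighbors(triples):
    """Map each entity to the set of entities it is linked to, ignoring direction."""
    neighbors = defaultdict(set)
    for head, _relation, tail in triples:
        neighbors[head].add(tail)
        neighbors[tail].add(head)
    return neighbors


def jaccard(neighbors, a, b):
    """Jaccard coefficient of the two neighbor sets (assumed measure)."""
    union = neighbors[a] | neighbors[b]
    if not union:
        return 0.0
    return len(neighbors[a] & neighbors[b]) / len(union)


def rank_related(neighbors, stocks):
    """Score every stock pair and sort from most to least similar."""
    pairs = combinations(sorted(stocks), 2)
    scored = [(a, b, jaccard(neighbors, a, b)) for a, b in pairs]
    return sorted(scored, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    for a, b, score in rank_related(build_neighbors(TRIPLES), STOCKS):
        print(f"{a} - {b}: {score:.2f}")
```

In this toy graph, StockA and StockB sit in different industries but share two shareholders, so they rank above the same-industry pair StockA and StockC. That is the kind of cross-industry relatedness the abstract refers to, though the paper's actual algorithms and features may differ.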

Keywords: knowledge association; knowledge graph; financial securities; graph data mining; stock discovery

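Classification number: TP391 [Automation and Computer Technology - Computer Application Technology]; G353 [Automation and Computer Technology - Computer Science and Technology]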

 
