Company Financial Event Detection in Microblog Based on Ensemble SVM Data Stream Classification Algorithm  (cited by: 3)


Authors: Xia Qianzi (夏千姿), Ni Liping (倪丽萍) [1,2], Ni Zhiwei (倪志伟) [1,2], Zhu Xuhui (朱旭辉) [1,2], Li Xiang (李想)

Affiliations: [1] School of Management, Hefei University of Technology, Hefei 230009, Anhui, China; [2] Key Laboratory of Process Optimization & Intelligent Decision-making, Ministry of Education, Hefei University of Technology, Hefei 230009, Anhui, China

Source: Computer Applications and Software (《计算机应用与软件》), 2021, No. 8, pp. 150-159, 174 (11 pages in total)

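Funding: National Natural Science Foundation of China, Young Scientists Fund (71301041); National Natural Science Foundation of China, Major Research Plan cultivation project (91546108).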

Abstract: Previous event detection algorithms require large numbers of training samples and cannot detect events dynamically. To detect financial events in short microblog texts, this paper proposes a new model for detecting company financial events from microblogs. The model combines word embedding with an ensemble data stream classification algorithm: word embeddings and a trigger dictionary are used to represent Chinese microblog texts, and an ensemble data stream classification algorithm with dynamic time windows (DSESVM) performs online event classification, which greatly reduces the training data required and allows events to be detected dynamically. Microblog texts about five listed companies serve as the test corpus. The experimental results show that the method not only lowers the proportion of training samples but also detects concept drift, and that it effectively improves the accuracy of company financial event detection in microblogs; compared with existing methods, the average F1 value rises by 5.6 to 7.2 percentage points.
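The abstract does not give the details of DSESVM, so the following is only a minimal sketch of the general idea it names: a chunk-based ensemble of linear SVMs over a data stream, with a window of retained classifiers that shrinks when concept drift is suspected. Everything concrete here is an assumption made for illustration: the names `StreamEnsemble`, `train_linear_svm` and `make_chunk`, the Pegasos-style trainer, the accuracy-based member weights, the drift threshold of 0.7, and the toy 2-D vectors that stand in for the word-embedding and trigger-dictionary features.

```python
import random

def decision(w, x):
    """Signed score of a linear model; the last entry of w is the bias."""
    return sum(wj * xj for wj, xj in zip(w, x)) + w[-1]

def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Pegasos-style stochastic sub-gradient training (labels must be -1/+1)."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                      # decaying step size
            violated = y[i] * decision(w, X[i]) < 1.0  # hinge loss is active
            w = [(1.0 - eta * lam) * wj for wj in w]   # regularisation shrink
            if violated:
                xi = list(X[i]) + [1.0]
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w

def accuracy(predict, X, y):
    return sum(predict(x) == label for x, label in zip(X, y)) / len(y)

class StreamEnsemble:
    """Hypothetical chunk-based SVM ensemble with a shrinking/growing window."""

    def __init__(self, max_members=5, drift_threshold=0.7):
        self.max_members = max_members
        self.drift_threshold = drift_threshold
        self.members = []   # one weight vector per retained chunk, newest last
        self.weights = []   # voting weight of each member

    def predict(self, x):
        vote = sum(a * (1 if decision(w, x) >= 0 else -1)
                   for a, w in zip(self.weights, self.members))
        return 1 if vote >= 0 else -1

    def update(self, X, y):
        """Test-then-train on one labelled chunk; returns pre-update accuracy."""
        acc = accuracy(self.predict, X, y) if self.members else None
        if acc is not None and acc < self.drift_threshold:
            self.members = []  # drift suspected: shrink the window to nothing
        self.members.append(train_linear_svm(X, y))
        self.members = self.members[-self.max_members:]
        # Re-weight every retained member by its accuracy on the newest chunk.
        self.weights = [
            accuracy(lambda x, w=w: 1 if decision(w, x) >= 0 else -1, X, y)
            for w in self.members
        ]
        return acc

def make_chunk(rng, flipped=False, n=20):
    """Toy 2-D chunk: two well-separated clusters; `flipped` swaps the labels."""
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        centre = -2.0 * label if flipped else 2.0 * label
        X.append([centre + rng.uniform(-0.5, 0.5),
                  centre + rng.uniform(-0.5, 0.5)])
        y.append(label)
    return X, y

rng = random.Random(42)
model = StreamEnsemble()
for flipped in [False, False, False, True, True]:
    X, y = make_chunk(rng, flipped)
    print(flipped, model.update(X, y), len(model.members))
```

In this toy stream the labels are swapped from the fourth chunk on. The ensemble's accuracy on that chunk drops below the threshold, the older classifiers are discarded, and the window grows again from the newest chunk. A real system would replace the toy vectors with text features and would likely use a more careful drift test than a fixed accuracy threshold.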

Keywords: company financial event detection; ensemble classification algorithm; data stream mining; word embedding

Classification number: TP39 [Automation and Computer Technology: Computer Application Technology]

 
