Affiliations: [1] School of Economics and Management, Beijing Information Science & Technology University, Beijing, 100192, China [2] China Academy of Aerospace Systems Science and Engineering, Beijing, 100048, China [3] Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China [4] Beijing Key Lab of Green Development Decision Based on Big Data, Beijing, 100192, China [5] University of Chinese Academy of Sciences, Beijing, 100049, China
Source: Journal of Systems Science and Systems Engineering (系统科学与系统工程学报, English edition), 2018, No. 6, pp. 709-726 (18 pages)
Abstract: Societal risk classification is fundamental to online societal risk monitoring. This paper presents an empirical analysis of the challenges and feasibility of classifying BBS posts by societal risk. Following an effectiveness analysis, a Support Vector Machine based on Bag-of-Words (BOW-SVM) is adopted to validate the challenges, and distributed document embeddings of BBS posts generated by Paragraph Vector are used to study feasibility. With BOW-SVM, cross-validations are conducted on BBS posts labeled by different groups and annotators. The large fluctuations in the cross-validation results indicate differences in individual risk perception, which make societal risk classification more challenging. Furthermore, using the distributed document embeddings, the pairwise similarities of more than 300 thousand BBS posts from different societal risk categories are compared. Posts in the same societal risk category are more similar to one another than posts in different categories, which suggests that they share more features; this demonstrates the feasibility of societal risk classification of BBS posts and points to the possibility of improving the performance of societal risk monitoring.
Keywords: Societal risk classification; Tianya Forum; cross validation; pairwise similarity; individual risk perception
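The pairwise-similarity comparison described in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity as the measure and uses tiny hand-made vectors in place of Paragraph Vector embeddings; the function names, the labels "A" and "B", and the toy data are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(embeddings, labels):
    """Return (mean similarity within a category, mean similarity across categories)."""
    same, diff = [], []
    for i, j in combinations(range(len(embeddings)), 2):
        sim = cosine(embeddings[i], embeddings[j])
        # Sort each pair by whether the two posts share a category label.
        (same if labels[i] == labels[j] else diff).append(sim)
    return sum(same) / len(same), sum(diff) / len(diff)


# Toy stand-ins for document embeddings (assumption: real ones would come
# from a trained Paragraph Vector model and have many more dimensions).
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
labels = ["A", "A", "B", "B"]  # hypothetical risk categories

within, across = mean_pairwise_similarity(embeddings, labels)
print(within > across)
```

If the within-category mean exceeds the across-category mean, posts in the same category are closer in the embedding space, which is the pattern the abstract reports as evidence of feasibility.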