Research on a Derivative Online Health Rumor Identification Model Based on Text Feature Fusion (基于文本特征融合的衍生性网络健康谣言识别模型研究)  Cited by: 2

Authors: Chen Yanfang (陈燕方) [1]; Zhou Xiaoying (周晓英) [2]

Affiliations: [1] Renmin University of China Libraries, Beijing 100872; [2] School of Information Resource Management, Renmin University of China, Beijing 100872

Source: Library and Information Service (《图书情报工作》), 2023, No. 14, pp. 73-84 (12 pages)

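Funding: This work is one of the research outputs of the Renmin University of China Major Interdisciplinary Innovation Platform for Public Health and Disease Prevention and Control, under the "Special Funds for Central Universities to Build World-Class Universities (Disciplines) and Guide Distinctive Development", and of the National Social Science Fund of China Key Project "Research on the Theory and Practice of Infodemiology in the All-Media Context" (Grant No. 20AZD132).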

Abstract: [Purpose/Significance] Derivative online health rumors have a low generation threshold, recur with strong periodicity, and cause far-reaching harm. They are one of the key problems to prioritize in the identification and governance of online health rumors, and also an important point of breakthrough. [Method/Process] Using deep semantic representation and aggregation methods, this paper explores six element features of the text content of derivative online health rumors. Combined with a pre-trained model of the distributed semantic features of online health rumors, it builds a lexicon of content elements for online health rumor text covering 6 categories and 6,287 words. After the title features, the six element features of the content text, and the main body text features are represented in a unified vector space and fused, an online health rumor identification model based on multi-source text feature fusion is constructed. [Result/Conclusion] The empirical study shows that, compared with existing control models, the proposed text feature fusion model improves the accuracy of derivative online health rumor identification, and that the rich, extensible lexicon of health rumor elements offers useful resource support for subsequent research.
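The fusion step described in the abstract can be pictured with a small sketch. This record does not specify the paper's encoders, the names of the six element categories, or the fusion operator, so everything below is an assumption made for illustration: the lexicon entries and category names are placeholders, the bag-of-words encoder stands in for the deep semantic representation, and plain concatenation stands in for the fusion.

```python
from typing import Dict, List, Set

# Hypothetical six-category element lexicon. The category names and words
# are placeholders, not the paper's actual 6,287-word lexicon.
LEXICON: Dict[str, Set[str]] = {
    "element_1": {"cure", "miracle"},
    "element_2": {"cancer", "diabetes"},
    "element_3": {"experts", "doctors"},
    "element_4": {"garlic", "vinegar"},
    "element_5": {"never", "always"},
    "element_6": {"share", "forward"},
}


def element_features(tokens: List[str], lexicon: Dict[str, Set[str]]) -> List[float]:
    """One value per lexicon category: share of tokens found in that category."""
    total = max(len(tokens), 1)  # avoid division by zero on empty text
    return [sum(t in words for t in tokens) / total for words in lexicon.values()]


def bag_of_words(tokens: List[str], vocab: List[str]) -> List[float]:
    """Toy stand-in for a learned text encoder: raw counts over a fixed vocabulary."""
    return [float(tokens.count(w)) for w in vocab]


def fuse(title_vec: List[float], element_vec: List[float], body_vec: List[float]) -> List[float]:
    """Assumed fusion operator: concatenate the three feature vectors."""
    return title_vec + element_vec + body_vec


# Toy example (English tokens for readability; the paper works on Chinese text).
vocab = ["garlic", "cure", "cancer", "experts"]
title = ["garlic", "cure", "cancer"]
body = ["experts", "say", "garlic", "always", "cure", "cancer", "share"]

fused = fuse(
    bag_of_words(title, vocab),
    element_features(body, LEXICON),
    bag_of_words(body, vocab),
)
print(len(fused))  # 4 title dims + 6 element dims + 4 body dims
```

The fused vector would then be passed to a classifier. Concatenation keeps each feature source in its own block of dimensions, which makes it easy to drop one source and compare results, but it is only one of several possible fusion choices.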

Keywords: online health rumors; health rumor identification; text features; text mining

Classification No.: R-05 [Medicine and Health]; G206 [Cultural Science: Communication Studies]

 
