Design of a Three-layer Filtration System Based on Text Message  Cited by: 1


Authors: Hu Liu (胡柳)[1], Zhou Liqian (周立前)[1], Huang Lijun (黄丽君)[1]

Affiliation: [1] School of Computer and Communication, Hunan University of Technology, Zhuzhou, Hunan 412008, China

Source: Computer Technology and Development (《计算机技术与发展》), 2013, No. 4, pp. 135-138 (4 pages)

Funding: Hunan Provincial International Cooperation Fund Project (2011WK3032)

Abstract: To improve the efficiency of text information filtering, this paper proposes a three-layer filtering system for text information. The system is organized into two parts horizontally and three layers vertically. The first layer filters by IP and URL address. The second layer computes keyword frequency and weight statistics separately for three parts of a message: the title, the keywords, and the body. The third layer filters by content feature analysis, drawing on word segmentation, keyword weight calculation, the vector space model (VSM), and topic tendency analysis so that undesirable information is identified efficiently and accurately. Experiments show that the system filters well, with recall and precision clearly better than those of the KNN method, and that in real-time filtering it blocks the spread of undesirable information promptly.
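The three layers described in the abstract can be sketched as a simple cascade. This is only an illustrative sketch, not the authors' implementation: the abstract does not give the blacklist, the weights, the thresholds, the scoring formulas, or how the layers hand off to one another, so every constant and function name below (`BLOCKED_HOSTS`, `KEYWORD_WEIGHTS`, `FIELD_WEIGHTS`, `BAD_PROFILE`, `filter_message`, and the rest) is a hypothetical placeholder. The early-exit order is an assumption, and a plain term-frequency cosine similarity stands in for the paper's VSM and topic tendency analysis. A lowercase regex split stands in for Chinese word segmentation.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical placeholder values; none of these come from the paper.
BLOCKED_HOSTS = {"bad.example.com", "203.0.113.7"}
KEYWORD_WEIGHTS = {"gamble": 2.0, "casino": 1.5}
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "body": 1.0}
KEYWORD_THRESHOLD = 4.0
SIMILARITY_THRESHOLD = 0.5
# Term-frequency vector standing in for a learned "undesirable" profile.
BAD_PROFILE = Counter({"gamble": 3, "casino": 2, "bet": 2, "win": 1})


def tokenize(text):
    """Lowercase word split; a stand-in for real word segmentation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def layer1_blocked(url):
    """Layer 1: reject messages whose host name or IP is blacklisted."""
    return urlparse(url).hostname in BLOCKED_HOSTS


def layer2_score(title, keywords, body):
    """Layer 2: weighted keyword frequency, computed per message part."""
    score = 0.0
    for field, text in (("title", title), ("keywords", keywords), ("body", body)):
        counts = Counter(tokenize(text))
        hits = sum(w * counts[k] for k, w in KEYWORD_WEIGHTS.items())
        score += FIELD_WEIGHTS[field] * hits
    return score


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def layer3_similarity(body):
    """Layer 3: compare the body's term vector with the profile vector."""
    return cosine(Counter(tokenize(body)), BAD_PROFILE)


def filter_message(url, title, keywords, body):
    """Run the layers in order and stop at the first one that fires."""
    if layer1_blocked(url):
        return "blocked:address"
    if layer2_score(title, keywords, body) >= KEYWORD_THRESHOLD:
        return "blocked:keywords"
    if layer3_similarity(body) >= SIMILARITY_THRESHOLD:
        return "blocked:content"
    return "pass"
```

Ordering the checks from cheapest to most expensive means that most traffic never reaches the vector comparison, which is one plausible reading of how a layered design could improve filtering efficiency.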

Keywords: text information; three-layer filtering; vector space model; topic tendency

CLC number: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
