Spam message filtering based on dependency grammar (original Chinese title: 基于依存文法的垃圾短信自动识别)  Cited by: 2


Authors: Yi Junkai (易军凯)[1], Luo Huiming (罗会明)[1]

Affiliation: [1] College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China

Source: Journal of Beijing University of Chemical Technology (Natural Science Edition), 2013, Issue B12, pp. 81-85 (5 pages)

Abstract: To address the current flood of spam short messages, this paper proposes a Chinese short message filtering method that selects combined features based on dependency grammar. The method parses each message syntactically, merges words linked by strong dependency relations into combined features that better represent the message content, and then classifies the message with a text classification algorithm. Because it takes the relations between words into account, the combined-feature method selects message features in a way that is closer to how people reason, and to a certain extent it incorporates some semantic information. Experimental results show that the dependency-grammar-based method achieves better classification results in Chinese short message filtering than traditional approaches.
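
The feature-combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the set of "strong" relation labels, the arc format, the function name `combined_features`, and the sample tokens are all assumptions made for illustration, and the dependency arcs are supplied by hand rather than produced by a real parser.

```python
from typing import List, Tuple

# Assumption: which relation labels count as "strong" is not stated in the
# abstract; these two labels are purely illustrative.
STRONG_RELATIONS = {"ATT", "VOB"}

# One dependency arc: (dependent index, head index, relation label).
Arc = Tuple[int, int, str]


def combined_features(tokens: List[str], arcs: List[Arc]) -> List[str]:
    """Merge word pairs joined by a strong relation; keep the other words."""
    merged = set()   # indices of words absorbed into some combined feature
    features = []
    for dep, head, rel in arcs:
        if rel in STRONG_RELATIONS:
            # Join the two words in their original sentence order.
            left, right = sorted((dep, head))
            features.append(tokens[left] + "_" + tokens[right])
            merged.update((dep, head))
    # Words that took part in no strong relation remain single-word features.
    features.extend(tok for i, tok in enumerate(tokens) if i not in merged)
    return features


# Hand-written example input standing in for parser output.
tokens = ["free", "gift", "claim", "now"]
arcs = [(0, 1, "ATT"), (1, 2, "VOB"), (3, 2, "ADV")]
features = combined_features(tokens, arcs)
print(features)
```

The resulting feature list could then be vectorized and passed to a text classifier such as a support vector machine, which the keywords mention; that stage is left out of the sketch.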

Keywords: spam short messages; short message filtering; feature extraction; dependency grammar; support vector machine

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
