Application of the fuzzy grammar method in crime text classification  (cited by: 2)


Authors: LIU Ying (刘莹)[1], WANG Ning (王宁)[1], LI Baohua (李保华)[2], LUO Qiang (罗强)[3]

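Affiliations: [1] Department of Computer Science, College of Mobile Telecommunications, Chongqing University of Posts and Telecommunications, Chongqing 401520; [2] Department of Information Engineering, Zhengzhou Chenggong University of Finance and Economics, Zhengzhou, Henan 451200; [3] College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065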

Source: Computer Engineering and Design (《计算机工程与设计》), 2017, No. 7, pp. 1965-1971 (7 pages)

Funding: Science and Technology Research Fund Project of the Chongqing Municipal Education Commission (KJ1402002)

Abstract: Existing text classification models have limited expressiveness and lack an easily understood representation. To address this, a fuzzy grammar method (FGM) is proposed and applied to crime text classification. Texts are first converted into fuzzy grammars according to a predefined grammar lexicon. The derived grammars are then combined into a more compact grammar, yielding a general representation of the learned text model. Finally, the learned fuzzy grammars are matched against the test set, and texts are classified according to their parse membership degree. Compared with support vector machines, naive Bayes, boosting, and k-nearest neighbors, the proposed FGM algorithm performs comparably to these machine learning methods. When the learned model changes slightly, it does not need to be rebuilt as in traditional approaches. FGM also stands out in some respects: in classifying explosion incidents it achieves the highest F-measure (about 83%) and the highest precision (about 94%). In addition, it is easy to integrate into a more comprehensive grammar system.

Keywords: fuzzy grammar; text classification; derived grammar; learning; membership degree

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
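
Illustrative sketch: the abstract names three steps (convert text to a grammar through a lexicon, combine the derived grammars into a more compact set, classify by a membership degree). The Python below is a minimal toy version of that pipeline, not the authors' algorithm. The lexicon, the class tags, the training sentences, the threshold, and the choice of an edit-distance-based membership score are all assumptions made for illustration; the paper's actual grammar formalism and membership function are not given in this record.

```python
from typing import Dict, Iterable, List, Tuple

Grammar = Tuple[str, ...]

# Hypothetical lexicon (word -> grammar class); not taken from the paper.
LEXICON: Dict[str, str] = {
    "bomb": "WEAPON", "grenade": "WEAPON", "knife": "WEAPON",
    "exploded": "ACTION_EXPLODE", "detonated": "ACTION_EXPLODE",
    "robbed": "ACTION_ROB",
    "market": "PLACE", "station": "PLACE", "bank": "PLACE",
}


def to_grammar(text: str) -> Grammar:
    """Step 1: map words to lexicon classes; runs of unknown words collapse to one ANY."""
    tags: List[str] = []
    for word in text.lower().split():
        tag = LEXICON.get(word.strip(".,;:!?"), "ANY")
        if tag == "ANY" and tags and tags[-1] == "ANY":
            continue
        tags.append(tag)
    return tuple(tags)


def combine(grammars: Iterable[Grammar]) -> List[Grammar]:
    """Step 2 (simplified): merge identical derived grammars, keeping first-seen order."""
    return list(dict.fromkeys(grammars))


def edit_distance(a: Grammar, b: Grammar) -> int:
    """Levenshtein distance between two tag sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def membership(candidate: Grammar, learned: Grammar) -> float:
    """Assumed membership degree in [0, 1]: 1 minus the normalized edit distance."""
    longest = max(len(candidate), len(learned))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(candidate, learned) / longest


def classify(text: str, models: Dict[str, List[Grammar]],
             threshold: float = 0.5) -> Tuple[str, float]:
    """Step 3: pick the class whose learned grammar gives the highest membership."""
    candidate = to_grammar(text)
    best_label, best_score = "unknown", 0.0
    for label, grammars in models.items():
        for learned in grammars:
            score = membership(candidate, learned)
            if score > best_score:
                best_label, best_score = label, score
    if best_score < threshold:
        return "unknown", best_score
    return best_label, best_score


# Toy training texts (invented for this example).
TRAIN = {
    "explosion": ["A bomb exploded near the market",
                  "The grenade detonated at the station"],
    "robbery": ["Two men robbed the bank"],
}
MODELS = {label: combine(to_grammar(t) for t in texts)
          for label, texts in TRAIN.items()}

print(classify("A grenade exploded in the bank", MODELS))
```

In this toy setup both explosion sentences reduce to the same tag sequence, so the combined model stores one grammar for that class. Adding a new training sentence only appends to or leaves unchanged the grammar list, which loosely mirrors the abstract's remark that small model changes do not require a rebuild.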

 
