Study on Short Text Categorization Technology Oriented Towards Field of Auditing

Cited by: 7


Authors: Wu Yang (伍洋)[1], Zhong Ming (钟鸣)[1], Jiang Yan (姜艳)[2], Li Shijun (李石君)[1]

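Affiliations: [1] School of Computer Science, Wuhan University, Wuhan, Hubei 430072; [2] Computer Center, Audit Office of Hubei Province, Wuhan, Hubei 430072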

Source: Microelectronics & Computer (《微电子学与计算机》), 2015, No. 1, pp. 5-10 (6 pages)

Funding: National Natural Science Foundation of China (61272109; 61202036)

Abstract: Audit problems are short texts with sparse features and fuzzy boundaries between problem categories. To address this, an improved short text classification method oriented towards the auditing domain is proposed. The method first builds a specialized feature set for audit problems, using an auditing-domain synonym set and a library of laws and regulations as its basis, and adjusts feature weights with specific rules, including the weights of words highly similar to target words. A modified SVM decision tree then serves as the multi-class classifier for short text classification. Experimental results show that the method achieves fairly satisfactory accuracy when classifying audit problems and can meet practical classification needs.
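This record lists information gain among its keywords, but the abstract does not say how the paper computes it. The sketch below shows the standard textbook definition for a binary "term present / term absent" feature. The function names, toy documents, and class labels are illustrative assumptions, not the authors' implementation or data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(docs, labels, term):
    """IG(term) = H(C) - P(t) * H(C | t) - P(not t) * H(C | not t)."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (
        (len(with_term) / n) * entropy(with_term)
        + (len(without_term) / n) * entropy(without_term)
    )
    return entropy(labels) - conditional


# Hypothetical toy corpus: each short text is a set of tokens.
docs = [
    {"invoice", "missing"},
    {"invoice", "forged"},
    {"budget", "overrun"},
    {"budget", "missing"},
]
labels = ["accounting", "accounting", "budgeting", "budgeting"]

# Rank the vocabulary by information gain, highest first.
vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True)
```

In this toy corpus "invoice" separates the two classes perfectly, so it scores the maximum of 1 bit, while "missing" appears equally in both classes and scores 0. Scores like these are what a weight adjustment step of the kind the abstract mentions would start from.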

Keywords: audit problem classification; auditing domain; information gain; SVM decision tree; short text classification; audit report

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
