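Affiliations: [1] School of Computer Science, Wuhan University, Wuhan, Hubei 430072 [2] Computer Center, Audit Office of Hubei Province, Wuhan, Hubei 430072
Source: Microelectronics & Computer (《微电子学与计算机》), 2015, No. 1, pp. 5-10 (6 pages)
Funding: National Natural Science Foundation of China (61272109; 61202036)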
Abstract: Audit problems are short texts with sparse features and fuzzy boundaries between problem categories. To address this, an improved short text classification method for the auditing domain is proposed. The method first builds a specialized feature set for audit problems, based on a synonym set and a regulation library for the auditing domain. It then applies designated rules to adjust feature weights, in particular for words that are highly similar to the target words. Finally, a modified SVM decision tree serves as the multi-class classifier for short text classification. Experimental results show that the method achieves fairly satisfactory accuracy when classifying audit problems and can meet practical classification needs.
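Keywords: audit problem classification; auditing domain; information gain; SVM decision tree; short text classification; audit report
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]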
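The feature-weighting step described in the abstract could look roughly like the sketch below. It is only an illustration under stated assumptions: the synonym entries, the regulation terms, the boost factor, and the function name `weighted_features` are all hypothetical, and the record does not give the paper's actual rules. The SVM decision tree stage is not shown.

```python
from collections import Counter

# Hypothetical synonym set: maps surface terms to one canonical audit term.
SYNONYMS = {
    "misappropriation": "diversion",
    "embezzlement": "diversion",
    "overstated": "inflated",
}

# Hypothetical terms drawn from a regulation library; matches get a boost.
REGULATION_TERMS = {"diversion", "procurement"}
REGULATION_BOOST = 2.0  # assumed rule, not taken from the paper


def weighted_features(tokens):
    """Return a term -> weight mapping for one tokenized audit problem."""
    # Merge synonyms so that sparse short texts share features.
    canonical = [SYNONYMS.get(t, t) for t in tokens]
    counts = Counter(canonical)
    total = sum(counts.values())
    weights = {}
    for term, count in counts.items():
        weight = count / total  # plain term frequency
        if term in REGULATION_TERMS:
            weight *= REGULATION_BOOST  # emphasize regulation vocabulary
        weights[term] = weight
    return weights
```

Merging synonyms before counting is one simple way to reduce feature sparseness in short texts; the resulting weight vectors would then be passed to whichever multi-class classifier is used.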