Affiliation: [1] School of Information, Hainan University, Haikou 570228
Source: Modern Computer (《现代计算机》), 2008, No. 12, pp. 86-88 (3 pages)
Abstract: In text classification, the feature selection algorithms in common use score each feature individually, using some evaluation function that measures how well the feature discriminates between classes. They consider only the relevance between features and classes and pay too little attention to the relevance among the features themselves, so the resulting feature set is often redundant. To address this problem, the paper proposes a new feature selection algorithm for text classification that helps select features that are strongly discriminative and only weakly correlated with one another. Experiments show that the method performs better than traditional feature selection algorithms.
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]
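The abstract does not spell out the proposed algorithm, so the following is only a minimal sketch of the general idea it describes: prefer features that are relevant to the class but weakly related to features already chosen. It assumes an mRMR-style greedy criterion (mutual information with the class minus mean mutual information with the selected features) as a stand-in, not the paper's method. The term names and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def select_features(columns, labels, k):
    """Greedily pick k features by relevance minus mean redundancy.

    columns: dict mapping a term name to its 0/1 presence per document.
    This is an assumed mRMR-style criterion, not the paper's algorithm.
    """
    relevance = {name: mutual_information(col, labels)
                 for name, col in columns.items()}
    selected = []
    remaining = set(columns)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(mutual_information(columns[name], columns[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy corpus: 8 documents, class 1 = sports, class 0 = other.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "goal":  [1, 1, 1, 0, 0, 0, 0, 0],
    "match": [1, 1, 1, 0, 0, 0, 0, 0],  # exact duplicate of "goal"
    "stock": [0, 0, 0, 0, 1, 1, 0, 0],
    "the":   [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the class
}

print(select_features(columns, labels, 2))  # ['goal', 'stock']
```

Ranking by relevance alone would pick "goal" and "match", which carry the same information. The redundancy term pushes the second choice to "stock" instead, which is the kind of redundancy the abstract says conventional per-feature scoring overlooks.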