Rare Category Detection Algorithm Based on Cluster Separability (基于簇间分离性的稀有类识别算法)

Authors: Yan Xuanhui (严宣辉) [1], Guo Gongde (郭躬德) [2]

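Affiliations: [1] School of Mathematics and Computer Science, Fujian Normal University, Fuzhou 350007; [2] Fujian Provincial Key Laboratory of Network Security and Cryptology, Fujian Normal University, Fuzhou 350007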

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2014, No. 6, pp. 502-508 (7 pages)

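Funding: Supported by the National Natural Science Foundation of China (No. 61175123) and the Major Science and Technology Project for University-Industry Cooperation of Fujian Province (No. 2010H6007)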

Abstract: Rare category mining is an important research area in data mining with a wide range of applications. To address the shortcomings of traditional rare category detection algorithms, a rare category detection algorithm based on cluster separability (RDACS) is proposed, which combines density difference with an inter-cluster separability criterion. The algorithm uses the similarity of feature weights as the criterion for the separability between a rare category cluster and its surrounding data samples, and it relies on an active learning scheme to identify the rare categories. Experimental results on UCI public datasets and the KDD99 dataset show that, compared with existing algorithms of the same kind, RDACS has a fairly clear advantage in the number of queries required, which improves efficiency and reduces human error. RDACS is a complement to existing rare category detection methods.

Keywords: rare category; density; feature weight; separability

Classification code: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
