Feature-based augmentation and classification for tabular data  

Authors: Balachander Sathianarayanan, Yogesh Chandra Singh Samant, Prahalad S. Conjeepuram Guruprasad, Varshin B. Hariharan, Nirmala Devi Manickam

Affiliation: [1] Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India

Source: CAAI Transactions on Intelligence Technology (智能技术学报(英文)), 2022, Issue 3, pp. 481-491 (11 pages)

Abstract: Generating synthetic samples for tabular data is a strenuous task. Most of the time, the columns (features) in the dataset may not follow an ideal distribution function. The objective of the proposed algorithm, the Histogram Augmentation Technique (HAT), is to generate a dataset whose distribution is similar to that of the original dataset. The augmentation is performed on individual columns, with separate algorithms designed for continuous and discrete columns. Humans also use the features of an object for interpretation: when making a judgement, they notice prominent features and characterise the perceived object. However, conventional machine learning (ML) classifiers are designed and trained on the basis of samples. Taking features as the basis for classification, a Feature Importance Classifier (FIC) is attempted in this work. FIC treats every feature independently of the others and ranks the features by their dependence on the classified label. FIC was found to have the highest accuracy, improving accuracy by 5.54% on average compared with the other classifiers. The suggested algorithms were tested on five datasets and compared with two augmentation algorithms and four state-of-the-art ML classification algorithms.
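
The abstract does not give the details of HAT, so the following is only a rough sketch of the general idea of column-wise, histogram-driven sampling, not the authors' algorithm. The function names (`sample_continuous`, `sample_discrete`, `augment`), the fixed bin count, and the uniform draw inside each bin are all assumptions made for illustration.

```python
import numpy as np

def sample_continuous(column, n_samples, bins=10, rng=None):
    """Draw values whose histogram approximates that of `column` (assumed scheme)."""
    rng = np.random.default_rng() if rng is None else rng
    counts, edges = np.histogram(column, bins=bins)
    probs = counts / counts.sum()
    # Pick a bin in proportion to its empirical frequency,
    # then draw uniformly between that bin's edges.
    idx = rng.choice(len(counts), size=n_samples, p=probs)
    return rng.uniform(edges[idx], edges[idx + 1])

def sample_discrete(column, n_samples, rng=None):
    """Resample the observed categories in proportion to their frequencies."""
    rng = np.random.default_rng() if rng is None else rng
    values, counts = np.unique(column, return_counts=True)
    return rng.choice(values, size=n_samples, p=counts / counts.sum())

def augment(table, discrete_cols, n_samples, bins=10, seed=0):
    """Augment a dict of columns, one column at a time (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    out = {}
    for name, col in table.items():
        col = np.asarray(col)
        if name in discrete_cols:
            out[name] = sample_discrete(col, n_samples, rng)
        else:
            out[name] = sample_continuous(col, n_samples, bins, rng)
    return out

# Toy data, invented for this example only.
table = {"age": [20.0, 25.0, 30.0, 35.0, 60.0], "kind": [0, 1, 1, 1, 0]}
augmented = augment(table, discrete_cols={"kind"}, n_samples=200)
```

Sampling each column independently, as this sketch does, preserves the per-column distributions but not the relationships between columns; how the paper handles that is not stated in the abstract.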

Keywords: FUNCTION; CLASSIFIER; COLUMNS

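Classification codes: TP391.1 [Automation and Computer Technology: Computer Application Technology]; TP181 [Automation and Computer Technology: Computer Science and Technology]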

 
