Automatic Rule Discovery for Data Transformation Using Fusion of Diversified Feature Formats  


Authors: G. Sunil Santhosh Kumar, M. Rudra Kumar

Affiliations: [1] Department of CSE, Jawaharlal Nehru Technological University, Anantapur, 515002, India [2] Department of CSE, Marri Laxman Reddy Institute of Technology and Management, Hyderabad, 500043, India [3] Department of Information Technology, Mahatma Gandhi Institute of Technology, Hyderabad, 500075, India

Source: Computers, Materials & Continua (English-language journal), 2024, Issue 7, pp. 695-713, 19 pages

Abstract: This article presents an approach to automatic rule discovery for data transformation tasks that uses XGBoost, a machine learning algorithm known for its efficiency and performance. The proposed framework fuses diversified feature formats, specifically metadata, textual, and pattern features. The goal is to improve the system's ability to discern and generalize transformation rules from source to destination formats in varied contexts. First, the article describes the methodology for extracting these distinct features from raw data and the pre-processing steps that prepare the data for the model. Subsequent sections explain feature optimization using Recursive Feature Elimination (RFE) with linear regression, which retains the most contributive features and eliminates redundant or less significant ones. The core of the research is the training of the XGBoost model on the prepared and optimized feature sets. The article presents a detailed overview of the mathematical model and algorithmic steps behind this procedure. Finally, the rule discovery process (the prediction phase) performed by the trained XGBoost model is explained, underscoring its role in real-time, automated data transformations. By employing machine learning, and the XGBoost model in particular, in the context of Business Rule Engine (BRE) data transformation, the article points to a shift toward more scalable, efficient, and less human-dependent data transformation systems. This research opens the door to further exploration of automated rule discovery systems and their applications in various sectors.
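The fusion of metadata, textual, and pattern features mentioned in the abstract can be pictured with a small sketch. The abstract does not say how each feature family is computed, so every function name, the vocabulary, and the list of known shapes below are hypothetical choices made only for illustration; this is not the authors' feature extractor. It shows one plausible way to turn a raw field value into a single fused vector of the kind that could then go through feature selection and model training.

```python
import re
from collections import Counter


def metadata_features(value):
    """Descriptive statistics of the raw value (assumed metadata features)."""
    n = len(value)
    digits = sum(ch.isdigit() for ch in value)
    letters = sum(ch.isalpha() for ch in value)
    return [float(n), digits / n if n else 0.0, letters / n if n else 0.0]


def textual_features(value, vocabulary):
    """Bag-of-words counts over a small fixed vocabulary (assumed textual features)."""
    tokens = Counter(re.findall(r"[a-z]+", value.lower()))
    return [float(tokens[word]) for word in vocabulary]


def pattern_signature(value):
    """Collapse a value into a shape string: digits become 9, letters become A."""
    shape = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", shape)


def pattern_features(value, known_shapes):
    """One-hot encode the value's shape against a list of known shapes."""
    shape = pattern_signature(value)
    return [1.0 if shape == s else 0.0 for s in known_shapes]


def fuse(value, vocabulary, known_shapes):
    """Concatenate the three feature families into one vector."""
    return (
        metadata_features(value)
        + textual_features(value, vocabulary)
        + pattern_features(value, known_shapes)
    )


# Hypothetical vocabulary and shapes for date-like fields.
VOCABULARY = ["jan", "feb", "usd"]
KNOWN_SHAPES = ["99/99/9999", "9999-99-99", "99 AAA 9999"]

vector = fuse("12/03/2024", VOCABULARY, KNOWN_SHAPES)
print(vector)
```

Plain concatenation keeps each feature family in a fixed block of the vector, which makes it easy to see afterwards which kind of feature a selection step such as RFE retained.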

Keywords: XGBoost; business rule engine; machine learning; categorical query language; humanitarian computing environment

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
