Test-cost-sensitive attribute reduction method in interval-valued decision tables


Authors: FAN Yiwen; LIAO Shujiao; WU Di [1,2,3,4]

Affiliations: [1] School of Mathematics and Statistics, Minnan Normal University, Zhangzhou 363000, Fujian, China; [2] Fujian Key Laboratory of Granular Computing and Application, Minnan Normal University, Zhangzhou 363000, Fujian, China; [3] Institute of Meteorological Big Data-Digital Fujian, Minnan Normal University, Zhangzhou 363000, Fujian, China; [4] Key Laboratory of Data Science and Statistics, Minnan Normal University, Zhangzhou 363000, Fujian, China

Source: Journal of Jiangsu Ocean University: Natural Science Edition, 2025, No. 1, pp. 51-61 (11 pages)

Funding: National Natural Science Foundation of China (12101289); Natural Science Foundation of Fujian Province, General Program (2024J01800).

Abstract: In the current era of big data, data processing is critical. Cost-sensitive learning is a research hotspot in machine learning, data mining, and related fields. Test cost is an important type of cost, and data processing often needs to take it into account. However, few attribute reduction methods for interval-valued data consider test costs. To address this gap, the paper discusses the problem of test-cost-sensitive attribute reduction in interval-valued decision tables, builds a corresponding rough set model, proposes a test-cost-related weighted attribute significance function, and designs a backtracking algorithm and a heuristic algorithm for test-cost-sensitive attribute reduction. Experiments on multiple UCI datasets verify the effectiveness of the proposed algorithms. Compared with two existing attribute reduction algorithms, the proposed algorithms show a significant advantage in reducing the total test cost. The backtracking algorithm always obtains the optimal reduction, while the heuristic algorithm obtains an optimal or sub-optimal reduction efficiently.

Keywords: interval value; decision table; attribute reduction; test cost; inconsistent objects

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
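
The abstract names two search strategies without giving their definitions, so the following is only a minimal sketch of the general idea, not the authors' method. Everything in it is an assumption made for illustration: the interval similarity measure, the threshold `theta`, the rule that a reduction must preserve the set of inconsistent objects, the cost-weighted score with exponent `lam`, the function names, and the toy data.

```python
def similarity(a, b):
    """Assumed measure: overlap length divided by the total span of two closed intervals."""
    span = max(a[1], b[1]) - min(a[0], b[0])
    if span == 0:
        return 1.0  # both intervals are the same single point
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0])) / span


def inconsistent(rows, labels, attrs, theta):
    """Objects that are theta-similar on every attribute in attrs to an object
    with a different decision (assumed notion of 'inconsistent object')."""
    attrs = list(attrs)
    bad = set()
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            if labels[i] != labels[j] and all(
                similarity(rows[i][a], rows[j][a]) >= theta for a in attrs
            ):
                bad.update((i, j))
    return bad


def backtrack_reduct(rows, labels, costs, theta):
    """Exhaustive search with cost pruning; assumes positive test costs."""
    n = len(costs)
    target = inconsistent(rows, labels, range(n), theta)
    best = {"attrs": tuple(range(n)), "cost": sum(costs)}

    def visit(k, chosen, cost):
        if cost >= best["cost"]:
            return  # cannot beat the best subset found so far
        if inconsistent(rows, labels, chosen, theta) == target:
            best["attrs"], best["cost"] = tuple(chosen), cost
            return  # adding further attributes would only raise the cost
        if k == n:
            return
        visit(k + 1, chosen + [k], cost + costs[k])  # include attribute k
        visit(k + 1, chosen, cost)                   # exclude attribute k

    visit(0, [], 0)
    return best["attrs"], best["cost"]


def heuristic_reduct(rows, labels, costs, theta, lam=1.0):
    """Greedy addition by a cost-weighted gain, then a redundancy-removal pass."""
    n = len(costs)
    target = inconsistent(rows, labels, range(n), theta)
    chosen = []
    current = inconsistent(rows, labels, chosen, theta)
    while current != target:
        def score(a):
            gain = len(current) - len(inconsistent(rows, labels, chosen + [a], theta))
            return (gain / costs[a] ** lam, -costs[a])  # ties go to the cheaper test
        rest = [a for a in range(n) if a not in chosen]
        chosen.append(max(rest, key=score))
        current = inconsistent(rows, labels, chosen, theta)
    for a in sorted(chosen, key=lambda a: -costs[a]):  # try dropping costly tests first
        trial = [b for b in chosen if b != a]
        if inconsistent(rows, labels, trial, theta) == target:
            chosen = trial
    return tuple(sorted(chosen)), sum(costs[a] for a in chosen)


# Toy interval-valued decision table (hypothetical): 4 objects, 3 attributes.
rows = [
    [(1, 2), (0, 1), (5, 6)],
    [(1, 2), (3, 4), (5, 6)],
    [(4, 5), (0, 1), (8, 9)],
    [(4, 5), (3, 4), (8, 9)],
]
labels = ["A", "B", "B", "A"]
costs = [4, 3, 1]  # hypothetical test cost of each attribute

print(backtrack_reduct(rows, labels, costs, theta=0.5))  # ((1, 2), 4)
print(heuristic_reduct(rows, labels, costs, theta=0.5))  # ((1, 2), 4)
```

In this toy table attributes 0 and 2 separate the same objects, so the subsets {0, 1} and {1, 2} both keep the table consistent; the search prefers {1, 2} because its total test cost is 4 rather than 7. The greedy version is not guaranteed to match the exhaustive one in general.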

 
