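Affiliations: [1] School of Artificial Intelligence, Jilin University, Changchun, Jilin 130000; [2] School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401
Source: Acta Electronica Sinica (《电子学报》), 2023, No. 6, pp. 1619-1636 (18 pages)
Funding: National Natural Science Foundation of China (No. 62076109).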
Abstract: Feature selection (FS) is an effective data preprocessing method. By selecting a set of highly relevant, low-redundancy features from high-dimensional data, it mitigates the curse of dimensionality caused by data redundancy. Many computational methods have been applied to the FS problem; among them, feature selection models based on the teaching and learning-based optimization algorithm (TLBO) have attracted growing attention because of their efficient global search capability. As data scale keeps increasing, however, the limitations of these algorithms (model instability, low model accuracy, and poor local search ability) have gradually stalled progress. To address these problems, this paper proposes a hybrid evolutionary Wrapper algorithm model that combines teaching and learning-based optimization with local search (LS): the Teaching and Learning-Based Optimization-Local Search algorithm (TLBOLS). First, because the conventional TLBO cannot be applied directly to feature selection, the algorithm converts the real-valued encoding to a binary encoding in the initialization phase. To preserve population diversity, it then introduces a worst-individual restart mechanism in the teaching phase and uses different TF values for the learner and teacher roles as the class evolves, yielding a binary teaching-learning feature selection algorithm (Binary Teaching and Learning-Based Optimization-Local Search algorithm, BTLBOLS). Next, a local search method that combines multiple operations with variable neighborhood search is proposed; it gradually increases the perturbation strength and improves the quality of individuals across the whole population. To optimize the feature selection results, BTLBOLS uses a comprehensive evaluation metric as the objective function that guides the overall evolutionary process. Forty-five high-dimensional cancer gene expression datasets were selected for testing, and BTLBOLS was compared with ten feature selection algorithms. The experimental results show that, compared with the other algorithms, BTLBOLS has certain advantages in both classification accuracy and the number of selected features, and that it effectively improves classification performance.
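The binary encoding, teaching phase, and worst-individual restart described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sigmoid transfer function, the weighted-sum fitness with weight `alpha`, the standard TLBO teacher update with a teaching factor drawn from {1, 2}, and the names `binarize`, `fitness`, `teacher_phase`, and `accuracy_fn` are all assumptions made for illustration. The abstract specifies neither the composite metric nor the role-dependent TF values.

```python
import numpy as np

def binarize(position, rng):
    """Map real-valued positions to a 0/1 feature mask.
    A sigmoid transfer function is assumed here; the abstract does not name one."""
    prob = 1.0 / (1.0 + np.exp(-position))
    return (rng.random(position.shape) < prob).astype(int)

def fitness(mask, accuracy_fn, alpha=0.9):
    """Assumed composite objective (lower is better): a weighted sum of
    classification error and the fraction of features selected."""
    if mask.sum() == 0:
        return 1.0  # an empty subset cannot be classified; give it the worst score
    error = 1.0 - accuracy_fn(mask)
    return alpha * error + (1.0 - alpha) * mask.sum() / mask.size

def teacher_phase(pop, masks, scores, accuracy_fn, rng):
    """One teaching-phase pass with the standard TLBO update,
    followed by a restart of the worst learner."""
    best = int(np.argmin(scores))
    teacher = pop[best].copy()
    mean = pop.mean(axis=0)
    for i in range(len(pop)):
        tf = rng.integers(1, 3)  # standard teaching factor, 1 or 2
        candidate = pop[i] + rng.random(pop.shape[1]) * (teacher - tf * mean)
        cand_mask = binarize(candidate, rng)
        cand_score = fitness(cand_mask, accuracy_fn)
        if cand_score < scores[i]:  # greedy acceptance: keep only improvements
            pop[i], masks[i], scores[i] = candidate, cand_mask, cand_score
    # Worst-individual restart: re-initialize the worst learner at random,
    # but never overwrite the current best.
    best = int(np.argmin(scores))
    worst = int(np.argmax(scores))
    if worst != best:
        pop[worst] = rng.uniform(-1.0, 1.0, pop.shape[1])
        masks[worst] = binarize(pop[worst], rng)
        scores[worst] = fitness(masks[worst], accuracy_fn)
    return pop, masks, scores
```

In a wrapper setting, `accuracy_fn` would typically train and cross-validate a classifier on the columns selected by `mask`; a full method along the lines of the abstract would add a learner phase and the local search step after each teaching pass.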
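Keywords: teaching and learning-based optimization algorithm; local search; novel hybrid Wrapper feature selection algorithm; feature selection; classification; gene expression data
Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]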