Authors: TAN Zhuokun, LUO Longfei, WANG Shunfang [1]
Affiliation: [1] School of Information Science and Engineering, Yunnan University, Kunming 650500, China
Source: Computer Engineering and Applications (《计算机工程与应用》), 2023, No. 5, pp. 70-77 (8 pages)
Funding: National Natural Science Foundation of China (62062067); Open Project of the Yunnan Key Laboratory of Intelligent Systems and Computing (ISC22Z01).
Abstract: The feature information provided by a single biological data network is very limited. To address this, a multi-network feature fusion method based on a semi-supervised autoencoder is proposed to enrich the feature information. In addition, because manually set hyperparameters tend to yield low model performance and convergence to local optima, a support vector machine optimized by a genetic algorithm (the GA-SVM algorithm) is further proposed to improve the prediction of brain disease genes. First, similarity data networks are constructed from different data sources. Features are then extracted from the four data networks with the random walk with restart algorithm, and processed and fused by the semi-supervised autoencoder. Finally, under 10-fold cross-validation, the GA-SVM model is used to predict brain disease genes and is compared with other algorithms. The experimental results show AUC and AUPR values of 0.805 and 0.792 on the PD dataset and of 0.825 and 0.823 on the MDD dataset, all better than those of existing prediction models, indicating that the method improves the prediction of brain disease genes.
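Keywords: GA-SVM algorithm; multi-network fusion; semi-supervised autoencoder; brain disease genes; 10-fold cross-validation
Classification: TP39 [Automation and Computer Technology: Computer Application Technology]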
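Note: the abstract names random walk with restart (RWR) as the step that extracts features from each similarity network. The sketch below illustrates the general RWR iteration only; it is not the authors' implementation. The function name `random_walk_with_restart`, the restart probability of 0.5, the convergence tolerance, and the three-node example network are all assumptions made for illustration, since the record does not give the paper's settings.

```python
def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Return the steady-state visiting probabilities of a walker that starts
    at node `seed` and jumps back to it with probability `restart` each step.

    `adj` is a square list-of-lists adjacency (similarity) matrix.
    All parameter defaults are illustrative assumptions.
    """
    n = len(adj)
    # Column-normalise so each column holds transition probabilities.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    W = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]

    p0 = [0.0] * n
    p0[seed] = 1.0          # the walker starts at the seed node
    p = p0[:]
    for _ in range(max_iter):
        # p_next = (1 - r) * W p + r * p0
        p_next = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n))
                  + restart * p0[i] for i in range(n)]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(p_next, p)) < tol:
            return p_next
        p = p_next
    return p


# Hypothetical 3-node path network: 0 - 1 - 2
example_network = [[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]]
profile = random_walk_with_restart(example_network, seed=0)
```

Running the walk once per node gives each gene a probability profile over the network; profiles of this kind, one per network, are the sort of feature vectors that could then be passed to a fusion step.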