Authors: BAO Zhikang (保志康), CHEN Jixuan (陈继璇), LIU Yinxiao (刘印晓), ZHANG Maoyuan (张茂源), ZHANG Hongbo (章洪博), LIU Zhen'an (刘振安), WEI Xiaojuan (魏晓娟) [1]
Affiliation: [1] College of Electrical Engineering, Northwest Minzu University, Lanzhou 730000, Gansu, China
Source: Biological Chemical Engineering (《生物化工》), 2024, No. 3, pp. 20-27 (8 pages)
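Funding: National Natural Science Foundation of China (12205241); Natural Science Foundation of Gansu Province (20JR10RA115); Innovation Fund for Higher Education Institutions of Gansu Province (2022B-074); Fundamental Research Funds for the Central Universities (31920220049, 31920230138).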
Abstract: DNA carries all of an organism's genetic information and determines the structure and function of its genes. Predicting the category to which a DNA sequence belongs makes it possible to judge whether an unknown sample is a new species, an alien species, or a well-known species. With the development of biotechnology, an important problem in bioinformatics is how to extract complete information from acquired DNA sequences, predict their sequence composition, find the rules governing that composition, and accurately reflect species characteristics. In this study, two classes of DNA sequence files with accession numbers CP021707 and CP085300 were downloaded from the NCBI website. Using feature extraction methods based on base frequency and base count, single-base, double-base, and triple-base features were extracted to construct 84-dimensional, 168-dimensional, and 35-dimensional feature vectors. Classification was then performed with K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and fused KNN-SVM models. The experimental results show that, with the 168-dimensional feature vector, the KNN-SVM model achieves higher classification accuracy than either the KNN or the SVM model alone, which is of positive significance for judging the relevant characteristics of an unknown class.
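The single-, double-, and triple-base frequency features mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `kmer_frequencies` and `feature_vector` are hypothetical, and the choice to concatenate the 4 single-base, 16 double-base, and 64 triple-base frequencies is an assumption. That concatenation happens to give 4 + 16 + 64 = 84 values, but the abstract does not say how the 84-, 168-, or 35-dimensional vectors were actually assembled, so the sketch makes no attempt to reproduce them.

```python
from itertools import product

BASES = "ACGT"


def kmer_frequencies(seq, k):
    """Relative frequency of every length-k word over A/C/G/T.

    Values are returned in a fixed lexicographic order (A < C < G < T),
    so the same index always refers to the same word.
    """
    seq = seq.upper()
    words = ["".join(p) for p in product(BASES, repeat=k)]
    counts = dict.fromkeys(words, 0)
    total = 0
    # Slide a window of width k across the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing N or other ambiguity codes
            counts[word] += 1
            total += 1
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def feature_vector(seq):
    """Concatenate 1-, 2-, and 3-base frequencies (assumed layout: 4 + 16 + 64)."""
    features = []
    for k in (1, 2, 3):
        features.extend(kmer_frequencies(seq, k))
    return features


if __name__ == "__main__":
    example = "ACGTACGTNNACGT"  # made-up fragment, not taken from CP021707 or CP085300
    vec = feature_vector(example)
    print(len(vec))
    print(vec[:4])  # frequencies of A, C, G, T
```

A vector built this way could be passed to any standard KNN or SVM classifier. How the paper fuses the two models into KNN-SVM is not described in the abstract, so that step is left out here.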