Authors: Khalid Mahmood Aamir, Muhammad Bilal, Muhammad Ramzan, Muhammad Attique Khan, Yunyoung Nam, Seifedine Kadry
Affiliations: [1] Department of CS&IT, University of Sargodha, Sargodha, 40100, Pakistan [2] Department of CS&IT, University of Mianwali, Mianwali, 42200, Pakistan [3] School of Systems and Technology, University of Management and Technology, Lahore, 54782, Pakistan [4] Department of Computer Science, HITEC University Taxila, Taxila, Pakistan [5] Department of Computer Science and Engineering, Soonchunhyang University, Asan, Korea [6] Faculty of Applied Computing and Technology, Noroff University College, Kristiansand, Norway
Source: Computers, Materials & Continua (English-language journal), 2021, Issue 12, pp. 3829-3844, 16 pages
Funding: This work was supported by the Soonchunhyang University Research Fund.
Abstract: Retroviruses are a large group of infectious agents with similar virion structures and replication mechanisms. AIDS, cancer, neurologic disorders, and other clinical conditions caused by retrovirus infections can all be fatal. Detection of retroviruses by genome sequence is a biological problem that benefits from computational methods. The National Center for Biotechnology Information (NCBI) promotes science and health by making biomedical and genomic data available to the public. This research aims to classify the different types of retrovirus genome sequences available at the NCBI. First, nucleotide pattern occurrences are counted in the given genome sequences at the preprocessing stage. Based on some significant results, the number of features used for classification is reduced to five. The classification is carried out in two phases. The first phase uses only two features. Data left unclassified in the first phase is passed to the next phase, where the final decision is made with the remaining three features. Three data sets of animal and human retroviruses are selected: the training data set is used to minimize the classifier's number and training, the validation data set is used to validate the models, and the test data set is used to analyze the performance of the classifier. Decision tree, naive Bayes, k-nearest neighbors, and support vector machine classifiers are also used to compare results. The results show that the proposed approach performs better than the existing methods on the imbalanced retrovirus genome-sequence data set.
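The preprocessing and two-phase flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `count_patterns` and `two_phase_classify`, the pattern length `k`, the ordering of the five features, and the convention that the first phase returns `None` for undecided samples are all assumptions made here for illustration.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def count_patterns(sequence: str, k: int = 2) -> Counter:
    """Count overlapping length-k nucleotide patterns in a genome sequence.

    The pattern length k is an assumption; the abstract does not state it.
    """
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def two_phase_classify(
    features: Sequence[float],
    phase_one: Callable[[Sequence[float]], Optional[str]],
    phase_two: Callable[[Sequence[float]], str],
) -> str:
    """Classify a five-feature sample in two phases.

    Assumption: the first two entries of `features` belong to phase one
    and the remaining three to phase two. `phase_one` returns a label,
    or None when it cannot decide; only then is `phase_two` consulted.
    """
    label = phase_one(features[:2])
    if label is not None:
        return label
    return phase_two(features[2:])


# Toy stand-in rules (hypothetical, for demonstration only).
def toy_phase_one(f: Sequence[float]) -> Optional[str]:
    if f[0] > 0.8:
        return "type_A"
    if f[1] > 0.8:
        return "type_B"
    return None  # undecided: defer to phase two


def toy_phase_two(f: Sequence[float]) -> str:
    return "type_C" if sum(f) > 1.5 else "type_D"


counts = count_patterns("ACGTAC", k=2)
decided_early = two_phase_classify([0.9, 0.1, 0.2, 0.3, 0.4], toy_phase_one, toy_phase_two)
decided_late = two_phase_classify([0.1, 0.1, 0.6, 0.6, 0.6], toy_phase_one, toy_phase_two)
```

In practice the two stand-in rules would be replaced by trained models, and the pattern counts would be normalized or filtered before being reduced to the five features the abstract mentions.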
Keywords: retroviruses; machine learning; bioinformatics; classification
Classification number: TP3 [Automation and Computer Technology: Computer Science and Technology]