Affiliation: [1] Department of Chemistry, Tongji University, Shanghai 200092
Source: Journal of Tongji University (Natural Science), 2007, No. 11, pp. 1548-1551, 1561 (5 pages)
Funding: National Natural Science Foundation of China (20275026)
Abstract (translated from the Chinese): In current biostatistics, the encoding of bases in deoxyribonucleic acid (DNA) is mostly limited to four types: adenine, guanine, cytosine and thymine. This encoding has too few variables and ignores the position of a base within the DNA; in splice site prediction, its accuracy does not exceed 90%. A knowledge-based encoding is therefore adopted, namely a table of statistical differences between true and false splice sites, which, combined with support vector machines, greatly improves the accuracy of splice site recognition. Going further, a multivariate encoding based on the statistical features of the bases brings the prediction rates for true and false donor sites to 96.4% and 93.0%, and those for true and false acceptor sites to 94.4% and 93.0%, respectively.

Abstract (English, as published): In biological statistics, the encoding of bases or nucleotides is usually limited to four types, i.e., adenine (A), cytosine (C), guanine (G) and thymine (T) for DNA. Two issues make biological statistics imperfect with such an encoding when applied to DNA sequences. One is that the number of types is too small; the other is that the encoding of a given nucleotide is always the same no matter where the nucleotide is located. In splice site prediction, for example, the accuracy is lower than ninety percent even though the sequences adjacent to the splice sites are highly conserved. To improve prediction accuracy, much attention has been paid to the performance of the algorithms adopted, and little to the fundamental issue, namely nucleotide encoding. In this paper, a predictor based on support vector machines is constructed to predict true and false splice sites for higher eukaryotes. The results show that the accuracies for the prediction of true donor sites and pseudo-sites are 96.3% and 93.1%, respectively, and the accuracies for the prediction of true acceptor sites and pseudo-sites are 94.0% and 93.1%, respectively.
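As a rough illustration of the position-aware encoding idea described in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes the "statistical difference table" can be approximated as the per-position base frequency at true sites minus the frequency at false sites. The function names (`position_frequencies`, `difference_table`, `encode`) and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter

BASES = "ACGT"

def position_frequencies(seqs):
    """Per-position relative frequency of each base across equal-length sequences."""
    length = len(seqs[0])
    table = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        table.append({b: counts[b] / len(seqs) for b in BASES})
    return table

def difference_table(true_seqs, false_seqs):
    """Assumed form of the table: frequency at true sites minus frequency at false sites."""
    freq_true = position_frequencies(true_seqs)
    freq_false = position_frequencies(false_seqs)
    return [{b: t[b] - f[b] for b in BASES} for t, f in zip(freq_true, freq_false)]

def encode(seq, table):
    """Replace each base by its position-specific difference value."""
    return [table[i][base] for i, base in enumerate(seq)]

# Toy, made-up windows around a candidate donor site (not data from the paper).
true_sites = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT", "AGGTAAGA"]
false_sites = ["ACGTTCCA", "TTGTACGC", "GAGTCTTA", "CCGTGGAT"]

table = difference_table(true_sites, false_sites)
features = encode("AGGTAAGT", table)
```

Unlike a fixed four-symbol code, the same base here maps to different numbers depending on its position, which is the property the abstract argues for. The resulting numeric vectors could then be passed to a support vector machine classifier (for example, scikit-learn's `SVC`); that step is omitted here.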