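Authors: Wang Shulin (王树林)[1], Wang Ji (王戟)[1], Chen Huowang (陈火旺)[1], Zhang Dingxing (张鼎兴)[1]
Affiliation: [1] School of Computer Science, National University of Defense Technology (国防科技大学计算机学院)
Source: Computer Engineering (《计算机工程》), 2007, No. 9, pp. 40-42 (3 pages)
Funding: National Natural Science Foundation of China (Grant No. 60233020)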
Abstract: The structure of a genome is closely related to its function, which is expressed mainly through its DNA subsequences, so studying the structure of DNA sequences is important for bioinformatics. This paper studies the problem of counting how often each k-mer (a DNA subsequence of length k) occurs in a complete DNA sequence, and designs and implements both an internal and an external k-mer counting algorithm. A hash function maps each k-mer to an integer key, which reduces k-mer frequency counting to counting repeated integer keys, so the classic B-tree algorithm can be applied to solve it. Three improvements tailored to this counting problem are proposed to further increase the algorithm's performance.
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
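
The reduction described in the abstract (map each k-mer to an integer key, then count repeated keys) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record does not give the paper's hash function, so a 2-bit-per-nucleotide encoding is assumed here, and a Python Counter (a hash table) stands in for the B-tree. The names CODE, count_kmers, and decode are hypothetical.

```python
from collections import Counter

# Assumed encoding: 2 bits per nucleotide. The paper's actual hash
# function is not stated in this record.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def count_kmers(seq: str, k: int) -> Counter:
    """Count every k-mer in seq, keyed by its integer encoding.

    A Counter stands in for the B-tree mentioned in the abstract.
    Windows containing a character other than A, C, G, T are skipped.
    """
    counts: Counter = Counter()
    if k <= 0 or len(seq) < k:
        return counts
    mask = (1 << (2 * k)) - 1  # keeps only the lowest 2*k bits
    key = 0
    valid = 0  # consecutive valid nucleotides ending at this position
    for ch in seq.upper():
        code = CODE.get(ch)
        if code is None:
            # Unknown symbol (for example N): restart the window.
            key = 0
            valid = 0
            continue
        # Slide the window: shift in the new nucleotide, drop the oldest.
        key = ((key << 2) | code) & mask
        valid += 1
        if valid >= k:
            counts[key] += 1
    return counts


def decode(key: int, k: int) -> str:
    """Turn an integer key back into its k-mer string."""
    letters = "ACGT"
    return "".join(letters[(key >> (2 * (k - 1 - i))) & 3] for i in range(k))


if __name__ == "__main__":
    for key, n in sorted(count_kmers("ACGTACGT", 3).items()):
        print(decode(key, 3), n)
```

Because each key is updated from the previous one with a shift and a mask, every position costs constant time regardless of k. With this encoding a key fits in a 64-bit integer for k up to 32, which is one reason integer keys suit an ordered index such as a B-tree.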