Authors: GONG Haiyan (龚海燕); ZHANG Sichen (张司臣); ZHANG Xiaotong (张晓彤) [2,3]
Affiliations: [1] Institute for Advanced Materials and Technology, University of Science and Technology Beijing, Beijing 100083, P.R. China; [2] School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, P.R. China; [3] Shunde Innovation School, University of Science and Technology Beijing, Foshan, Guangdong 528399, P.R. China
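Source: Journal of Biomedical Engineering (《生物医学工程学杂志》), 2024, No. 3, pp. 552-559 (8 pages)
Funding: National Key Research and Development Program of China (2023YFB3812901); China Postdoctoral Science Foundation (2023M740219); National Program for Funding Postdoctoral Researchers (GZC20230239).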
Abstract: The rapid development of high-throughput chromatin conformation capture (Hi-C) technology provides rich data on interactions between genomic loci for chromatin structure analysis. However, existing methods for identifying topologically associating domains (TADs) from Hi-C data suffer from low accuracy and sensitivity to parameters. To address this, the paper designs and implements a TAD identification method based on spatial density clustering. The method first preprocesses the raw Hi-C data to obtain a normalized Hi-C contact matrix. It then computes the distance matrix between loci, generates a reachability plot from the core distance and reachability distance of each locus, and extracts clusters. Finally, it derives TAD boundaries from the clustering results. The method identifies TAD structures with higher internal cohesion, and the TAD boundaries it finds are enriched with more ChIP-seq factors. Experimental results indicate that the method identifies TADs more accurately and is of greater practical value.
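Keywords: high-throughput chromatin conformation capture (Hi-C); topologically associating domain (TAD); chromatin structure; density clustering; bioinformatics analysis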
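The pipeline summarized in the abstract (contact matrix, distance matrix, core and reachability distances, reachability plot, clusters, boundaries) resembles the OPTICS family of density clustering. The following is a minimal sketch of that idea, not the authors' implementation. Everything specific in it is an assumption made for illustration: the contact-to-distance transform `1 / (1 + c)`, the function names, the `min_pts` and `threshold` parameters, the simple threshold cut of the reachability plot, and the rule that a boundary is placed wherever the cluster label changes between adjacent bins. Normalization of the raw Hi-C data is assumed to have happened already.

```python
import math

def contact_to_distance(contacts):
    """Map contact counts to distances (assumed transform: d = 1 / (1 + c))."""
    n = len(contacts)
    return [[0.0 if i == j else 1.0 / (1.0 + contacts[i][j]) for j in range(n)]
            for i in range(n)]

def optics_order(dist, min_pts):
    """Return (visit order, reachability distances, core distances).

    OPTICS-style ordering with an unbounded neighbourhood radius.
    """
    n = len(dist)
    # Core distance: distance to the min_pts-th nearest bin, counting the bin itself.
    core = [sorted(row)[min_pts - 1] for row in dist]
    reach = [math.inf] * n
    done = [False] * n
    order = []
    current = 0 if n else None
    while current is not None:
        done[current] = True
        order.append(current)
        # Update the reachability of every bin that has not been visited yet.
        for j in range(n):
            if not done[j]:
                reach[j] = min(reach[j], max(core[current], dist[current][j]))
        pending = [j for j in range(n) if not done[j]]
        # Visit the pending bin with the smallest reachability next (ties: lowest index).
        current = min(pending, key=lambda j: (reach[j], j)) if pending else None
    return order, reach, core

def extract_clusters(order, reach, core, threshold):
    """Cut the reachability plot at a fixed threshold; label -1 means noise."""
    labels = [-1] * len(order)
    cluster = -1
    for p in order:
        if reach[p] > threshold:
            if core[p] <= threshold:   # dense enough to open a new cluster
                cluster += 1
                labels[p] = cluster
        else:
            labels[p] = cluster
    return labels

def tad_boundaries(labels):
    """Assumed rule: a boundary sits where the label changes between adjacent bins."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]

# Toy contact matrix: two dense 3-bin blocks with no contacts between them.
contacts = [[9 if (i < 3) == (j < 3) else 0 for j in range(6)] for i in range(6)]
dist = contact_to_distance(contacts)
order, reach, core = optics_order(dist, min_pts=2)
labels = extract_clusters(order, reach, core, threshold=0.5)
print(labels)                  # [0, 0, 0, 1, 1, 1]
print(tad_boundaries(labels))  # [3]
```

On the toy matrix the reachability plot has a single jump at bin 3, which is where the sketch places the boundary between the two blocks. Real Hi-C matrices are noisy, so a fixed threshold is unlikely to be adequate there; the paper's own cluster extraction and boundary rules are not given in this record.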