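Affiliation: [1] Department of Automation, Tsinghua University; Beijing Institute of Interdisciplinary Research in Traditional Chinese Medicine, Tsinghua University; Beijing National Research Center for Information Science and Technology, Beijing 100084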
Source: Scientia Sinica (Informationis) (《中国科学:信息科学》), 2022, No. 5, pp. 856-869 (14 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62061160369, 81630103, 81225025).
Abstract: In the era of biomedical big data, comprehensively and effectively discovering key elements such as disease-causing genes and drug targets, and understanding the micro-level nature of macro-level phenotypes as a whole, is one of the major common challenges facing interdisciplinary research across information science, Western medicine, and traditional Chinese medicine (TCM). Biological systems are typical complex systems. The key to overcoming this challenge is to understand in depth the "relationship" nature of complex biological systems (CBSs), and thereby to solve the problem of multilevel information fusion in complex systems together with the high dimensionality, high noise, and small sample sizes that are widespread in biomedical big data. Biological networks are the basis of CBSs: they reflect the interrelationships among biological molecules in the human body, such as genes and gene products, as well as the relationships between biological molecules and diseases and drugs at different levels. Biological networks have been widely used in the analysis of biomedical big data. The Li Shao (李梢) research group began studying the relationship between Chinese and Western medicine and complex biological networks (CBNs) more than 20 years ago, took the lead in proposing the "network target" hypothesis, and went on to develop and apply corresponding methods. This article summarizes and reflects on the theory and methods of CBN-based relationship inference. First, in principle, it finds a "multilevel modular relationship" between macro-level disease phenotypes and micro-level molecules in CBNs: macro-level emergence has local modularity at the micro level, and the more similar the macro-level phenotypes, the stronger the modular association on the network among the micro-level disease-causing genes or drug targets. Second, methodologically, it presents a general CBN-based "relationship inference" framework for inferring key biological elements from biomedical big data with a small number of target samples. Built on the multilevel modular relationship and taking a global perspective, the framework consists of three parts: (1) relationship network construction, (2) relationship representation and modeling, and (3) inference of unknown relationships, which together make relationships concrete as entities, mathematically formalized, and treated as a whole. Third, in application, CBN-based relationship inference methods have performed well in predicting disease-causing genes and drug targets, identifying disease biomarkers, and elucidating the mechanisms of TCM. In summary, relationship inference methods can offer a systematic solution for revealing the scientific principles of Chinese and Western medicine from a systems perspective and at the molecular level, and provide important theoretical and methodological support for emerging disciplines such as network pharmacology.