Application of the SDAEC Algorithm to Batch Effect Correction of Single-Cell Sequencing Data

SDAEC Method and its Application in Batch Effect Removal for Single Cell mRNA Sequence


Authors: Wang Wenjie [1]; Li Kang [1]; Xie Hongyu (Department of Medical Statistics, Harbin Medical University, Harbin 150081)

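Affiliations: [1] Department of Medical Statistics, Harbin Medical University, 150081; [2] Clinical Research Center, Women's Hospital, Zhejiang University School of Medicine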

Source: Chinese Journal of Health Statistics (《中国卫生统计》), 2024, No. 4, pp. 501-506 (6 pages)

Funding: National Natural Science Foundation of China (82003551); Zhejiang Provincial Natural Science Foundation (LTGY24H160008).

Abstract: Objective: To propose the deep stacked denoising auto encoder embedded cluster (SDAEC) algorithm, apply it to batch effect removal in single-cell mRNA sequencing (scRNA-seq) data, and evaluate its batch effect removal performance. Methods: Because single-cell data are high-dimensional, highly sparse, and subject to highly non-linear error, the single-cell Louvain clustering algorithm was embedded into the stacked denoising auto encoder (SDAE) algorithm to form the SDAEC algorithm for batch effect removal in scRNA-seq data. SDAEC was applied to real ovarian cancer tissue scRNA-seq data, and its batch effect removal performance was evaluated using t-distributed stochastic neighbor embedding (tSNE), the k-nearest-neighbor batch-effect test (kBET), the adjusted Rand index (ARI), normalized mutual information (NMI), and average silhouette width (ASW). Results: SDAEC removed batch effects from scRNA-seq data better than ComBat, mutual nearest neighbors (MNN), maximum mean discrepancy distribution-matching residual networks (MMD-ResNet), and zero-inflated negative binomial-based wanted variation extraction (ZINB-WaVE). Conclusion: The SDAEC algorithm can remove batch effects from scRNA-seq data and improve the validity of downstream scRNA-seq analysis, and it has practical application value.
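
Two of the evaluation metrics named in the abstract, ARI and NMI, compare a clustering of the cells against a reference labeling. The sketch below shows one common way to compute them using only the Python standard library. It is an illustration, not the authors' code: the function names are hypothetical, the NMI normalization (arithmetic mean of the two entropies) is an assumption because the abstract does not state which variant was used, and kBET and ASW are not covered.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two cells")
    # Contingency table: number of cells for each (label_a, label_b) pair.
    pair_counts = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings form a single cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def normalized_mutual_information(labels_a, labels_b):
    """NMI, normalized by the arithmetic mean of the entropies (assumed)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    pair_counts = Counter(zip(labels_a, labels_b))
    mutual_info = sum(
        (c / n) * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in pair_counts.items()
    )
    entropy_a = -sum((c / n) * log(c / n) for c in count_a.values())
    entropy_b = -sum((c / n) * log(c / n) for c in count_b.values())
    mean_entropy = (entropy_a + entropy_b) / 2
    if mean_entropy == 0:
        # Both labelings are a single cluster, so they agree trivially.
        return 1.0
    return mutual_info / mean_entropy


if __name__ == "__main__":
    # Hypothetical toy labels: cluster assignments vs. annotated cell types.
    clusters = [0, 0, 1, 1, 2, 2]
    cell_types = ["T", "T", "B", "B", "NK", "T"]
    print(adjusted_rand_index(clusters, cell_types))
    print(normalized_mutual_information(clusters, cell_types))
```

Both scores depend only on how the cells are grouped, not on the label names, so cluster numbers can be compared directly against text annotations.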

Keywords: deep stacked denoising auto encoder embedded cluster (SDAEC); single-cell sequencing; batch effect; ovarian cancer

Classification No.: R195.1 [Medicine and Health: Health Statistics]

 
