TopDom: An Efficient and Deterministic Method for Identifying Topological Domains in Genomes

Hanjun Shin, Yi Shi, Chao Dai, Harianto Tjong, Ke Gong, Frank Alber, and Xianghong Jasmine Zhou

Abstract

The recent development of genome-wide proximity ligation assays (Hi-C and its variant TCC) allows the identification of chromatin contacts at unprecedented resolution. Several studies have revealed that mammalian chromosomes are composed of megabase-sized topological domains (TDs), which appear to be conserved across cell types and, to some extent, even between organisms. Identifying topological domains has therefore become an important step towards understanding the structure and function of spatial genome organization. However, current methods for TD identification generally demand extensive computational resources and require tuning of many parameters. Their results are also inconsistent, both between different methods and between runs of the same method with different parameter settings. In this work, we propose an efficient and deterministic method, TopDom, to identify TDs, along with a set of evaluation methods. Compared with the most widely used approaches, our method is not only computationally more efficient but also depends on only a single parameter, for which we provide a simple selection guideline. We show that our method identifies TDs more accurately with respect to their definition. Moreover, TDs identified by our method provide even stronger support for cross-tissue TD conservation. Furthermore, our analysis reveals that the locations of housekeeping genes are highly associated with cross-tissue conserved TDs.