Motif clustering with implications for transcription factor interactions

Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle, Germany
German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
Julius Kühn-Institut (JKI) - Federal Research Centre for Cultivated Plants, Quedlinburg, Germany
DOI
10.7287/peerj.preprints.1302v1
Subject Areas
Bioinformatics, Computational Biology, Molecular Biology, Computational Science
Keywords
motif, ChIP-seq, de Bruijn sequence, motif similarity, clustering, network
Copyright
© 2015 Grau et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Grau J, Grosse I, Posch S, Keilwagen J. 2015. Motif clustering with implications for transcription factor interactions. PeerJ PrePrints 3:e1302v1

Abstract

High-throughput data, for instance ChIP-seq data, measure the binding of transcription factors (TFs) or other proteins to DNA and have become a widespread data source for de novo motif discovery. Often, several ChIP-seq data sets study the same TF under different conditions, resulting in several, potentially redundant motifs, which calls for the identification and clustering of similar motifs. Here, we propose a refined measure of motif similarity based on the correlation between score profiles on de Bruijn sequences. We demonstrate the utility of the proposed measure in benchmark studies on artificial motifs and motifs discovered from ENCODE ChIP-seq data. We use this measure to cluster motifs discovered from 757 different ENCODE ChIP-seq data sets for 166 TFs and RNA polymerase II and III. Based on this clustering, we derive a TF interaction network that reflects many known TF-TF interactions but also reveals novel putative interaction partners.
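
To make the idea of correlating score profiles on de Bruijn sequences concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes motifs are given as position weight matrices of log-probabilities, uses the Pearson correlation, and takes the maximum over a few cyclic shifts. The function names (`de_bruijn`, `score_profile`, `similarity`) and the `max_shift` parameter are hypothetical, and refinements such as handling the reverse-complement orientation are omitted.

```python
import math

ALPHABET = "ACGT"

def de_bruijn(k, alphabet=ALPHABET):
    """Cyclic de Bruijn sequence of order k (standard Lyndon-word construction)."""
    n = len(alphabet)
    a = [0] * (n * k)
    seq = []

    def db(t, p):
        if t > k:
            if k % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, n):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in seq)

def log_pwm(rows):
    """Turn rows of probabilities (ordered A, C, G, T) into log-score columns."""
    return [{c: math.log(p) for c, p in zip(ALPHABET, row)} for row in rows]

def score_profile(pwm, seq):
    """Score every window of the cyclic sequence with the motif."""
    w = len(pwm)
    ext = seq + seq[:w - 1]  # unroll the cyclic sequence for the last windows
    return [sum(pwm[j][ext[i + j]] for j in range(w)) for i in range(len(seq))]

def pearson(x, y):
    """Pearson correlation of two equally long score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def similarity(m1, m2, max_shift=0):
    """Maximum profile correlation over cyclic shifts (illustrative assumption)."""
    k = max(len(m1), len(m2))  # every k-mer occurs exactly once in the sequence
    seq = de_bruijn(k)
    p1, p2 = score_profile(m1, seq), score_profile(m2, seq)
    return max(pearson(p1, p2[s:] + p2[:s])
               for s in range(-max_shift, max_shift + 1))

# Usage: a motif compared with a copy padded by one uninformative column.
motif = log_pwm([[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1]])
padded = log_pwm([[0.25, 0.25, 0.25, 0.25]]) + motif
other = log_pwm([[0.1, 0.1, 0.1, 0.7], [0.1, 0.1, 0.7, 0.1], [0.7, 0.1, 0.1, 0.1]])
print(similarity(motif, padded, max_shift=1), similarity(motif, other))
```

Because a de Bruijn sequence of order k contains each k-mer exactly once, the score profile samples the motif's behaviour on all k-mers with a sequence of only 4^k positions. A pairwise similarity of this kind could then serve as input to a clustering procedure.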

Author Comment

This work was presented at the German Conference on Bioinformatics 2015.