Motif clustering with implications for transcription factor interactions

Jan Grau; Ivo Grosse; Stefan Posch; Jens Keilwagen

doi:10.7287/peerj.preprints.1302v1

NOT PEER-REVIEWED

"PeerJ Preprints" is a venue for early communication or feedback before peer review. Data may be preliminary.

Highlighted in German Conference on Bioinformatics 2015 Collection

Jan Grau¹, Ivo Grosse¹,², Stefan Posch¹, Jens Keilwagen³

1 Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle, Germany

2 German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany

3 Julius Kühn-Institut (JKI) - Federal Research Centre for Cultivated Plants, Quedlinburg, Germany

Published: 2015-08-13
Accepted: 2015-08-13

Subject Areas: Bioinformatics, Computational Biology, Molecular Biology, Computational Science
Keywords: motif, ChIP-seq, de Bruijn sequence, motif similarity, clustering, network

Copyright: © 2015 Grau et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.

Cite this article: Grau J, Grosse I, Posch S, Keilwagen J. 2015. Motif clustering with implications for transcription factor interactions. PeerJ PrePrints 3:e1302v1 https://doi.org/10.7287/peerj.preprints.1302v1

Abstract

High-throughput data such as ChIP-seq data measure the binding of transcription factors (TFs) or other proteins to DNA and have become a widespread data source for de novo motif discovery. Often, several ChIP-seq data sets study the same TF under different conditions, resulting in several potentially redundant motifs, which calls for the identification and clustering of similar motifs. Here, we propose a refined measure of motif similarity based on the correlation between score profiles on de Bruijn sequences. We demonstrate the utility of the proposed measure in benchmark studies on artificial motifs and on motifs discovered from ENCODE ChIP-seq data. We use this measure to cluster motifs discovered from 757 different ENCODE ChIP-seq data sets for 166 TFs and RNA polymerase II and III. Based on this clustering, we derive a TF interaction network that reflects many known TF-TF interactions but also reveals novel putative interaction partners.
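
The general idea of correlating score profiles on a de Bruijn sequence can be illustrated with a minimal Python sketch. This is not the implementation used in the preprint: the function names (`de_bruijn`, `score_profile`, `pearson`, `profile_similarity`), the log-probability scoring, the order `k`, and the cyclic shift range `max_shift` are all assumptions made for illustration, and the sketch ignores reverse complements and any refinements the paper may apply.

```python
import math

ALPHABET = "ACGT"


def de_bruijn(k, alphabet=ALPHABET):
    """Cyclic de Bruijn sequence of order k (standard recursive construction)."""
    n = len(alphabet)
    a = [0] * (n * k)
    seq = []

    def db(t, p):
        if t > k:
            if k % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, n):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in seq)


def score_profile(pwm, seq):
    """Log-probability score of the motif at every cyclic position of seq.

    pwm is a list of dicts mapping each base to a probability > 0
    (an assumed representation, chosen for readability).
    """
    w = len(pwm)
    cyclic = seq + seq[:w - 1]  # wrap around so every position has a full window
    return [
        sum(math.log(pwm[j][cyclic[i + j]]) for j in range(w))
        for i in range(len(seq))
    ]


def pearson(x, y):
    """Pearson correlation of two equally long lists (undefined for constant input)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def profile_similarity(pwm_a, pwm_b, k=4, max_shift=2):
    """Maximum correlation between the two score profiles over small cyclic shifts."""
    seq = de_bruijn(k)
    pa = score_profile(pwm_a, seq)
    pb = score_profile(pwm_b, seq)
    best = -1.0
    for s in range(-max_shift, max_shift + 1):
        shifted = pb[s:] + pb[:s]  # cyclic rotation of the second profile
        best = max(best, pearson(pa, shifted))
    return best
```

Because the de Bruijn sequence is treated as cyclic, both profiles have the same length (4^k) regardless of motif width, so they can be correlated directly. Choosing `k` at least as large as the wider motif ensures that every possible binding site of that width is scored at least once.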

Author Comment

This work was presented at the German Conference on Bioinformatics 2015.
