TY  - JOUR
UR  - https://doi.org/10.7287/peerj.preprints.1302v1
DO  - 10.7287/peerj.preprints.1302v1
TI  - Motif clustering with implications for transcription factor interactions
AU  - Grau,Jan
AU  - Grosse,Ivo
AU  - Posch,Stefan
AU  - Keilwagen,Jens
DA  - 2015/08/13
PY  - 2015
KW  - motif
KW  - ChIP-seq
KW  - de Bruijn sequence
KW  - motif similarity
KW  - clustering
KW  - network
AB  - High-throughput data, for instance ChIP-seq data, measure binding of transcription factors (TFs) or other proteins to DNA and have become a widespread data source for de-novo motif discovery. Often, several ChIP-seq data sets study the same TF under different conditions, resulting in several potentially redundant motifs, which calls for identifying and clustering similar motifs. Here, we propose a refined measure of motif similarity based on the correlation between score profiles on de Bruijn sequences. We demonstrate the utility of the proposed measure in benchmark studies on artificial motifs and motifs discovered from ENCODE ChIP-seq data. We use this measure to cluster motifs discovered from 757 different ENCODE ChIP-seq data sets for 166 TFs and RNA-polymerase II and III. Based on this clustering, we derive a TF interaction network that reflects many known TF-TF interactions, but also reveals novel putative interaction partners.
VL  - 3
SP  - e1302v1
T2  - PeerJ PrePrints
JO  - PeerJ PrePrints
J2  - PeerJ PrePrints
SN  - 2167-9843
ER  - 