Construction of microRNA functional families by a mixture model of position weight matrices

Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea
Department of Biomedical Informatics, Asan Medical Center, Seoul, Korea
University of Ulsan College of Medicine, Seoul, Korea
School of Computer Science and Engineering, Seoul National University, Seoul, Korea
DOI
10.7287/peerj.preprints.59v1
Subject Areas
Bioinformatics, Computational Biology, Genomics
Keywords
mixture model, microRNA, EM algorithm, machine learning, position weight matrix, sequence analysis
Copyright
© 2013 Rhee et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Cite this article
Rhee J, Shin S, Zhang B. 2013. Construction of microRNA functional families by a mixture model of position weight matrices. PeerJ PrePrints 1:e59v1

Abstract

MicroRNAs (miRNAs) are small regulatory molecules that repress the translation of their target genes by binding to their 3' untranslated regions (3' UTRs). Because target genes are determined predominantly by sequence complementarity to the miRNA seed region (nucleotides 2-7), which is evolutionarily conserved, the target relationships and functions of miRNA family members are inferred to be conserved across many species. Detecting the relevant miRNA families with confidence would therefore help to clarify conserved miRNA functions and elucidate miRNA-mediated biological processes. We present a mixture model of position weight matrices for constructing miRNA functional families. The model systematically finds not only evolutionarily conserved miRNA family members but also functionally related miRNAs, and simultaneously generates position weight matrices representing the conserved sequences. In experiments on mammalian miRNA sequences, we identified potential miRNA groups that are characterized by similar sequence patterns and have common functions. We validated the results using score measures and an analysis of the conserved targets. Our method provides a way to comprehensively identify conserved miRNA functions.

Author Comment

This paper was submitted to the InCoB 2013 Conference (International Conference on Bioinformatics). It was accepted for presentation at the Conference and is also being submitted to PeerJ.

Supplemental Information

Sequence similarities represented by WebLogo in (a) c10, (b) c63, (c) c44 and (d) c36

DOI: 10.7287/peerj.preprints.59v1/supp-1

microRNA lists in each group

DOI: 10.7287/peerj.preprints.59v1/supp-2

GO (Molecular Function) enrichment analyses for target genes in each group

DOI: 10.7287/peerj.preprints.59v1/supp-3

GO (Biological Process) enrichment analyses for target genes in each group

DOI: 10.7287/peerj.preprints.59v1/supp-4