1001 - A tool for binary representations of unordered multistate characters (with examples from genomic data)
- Published
- Accepted
- Subject Areas
- Bioinformatics, Computational Biology, Theory and Formal Methods
- Keywords
- binary characters, molecular characters, polarity, Cladistics, genomics, three-taxon statements, three-taxon analysis
- Copyright
- © 2015 Mavrodiev
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
- Cite this article
- 2015. 1001 - A tool for binary representations of unordered multistate characters (with examples from genomic data). PeerJ PrePrints 3:e1153v1 https://doi.org/10.7287/peerj.preprints.1153v1
Abstract
In modern molecular systematics, matrices of unordered multistate characters, such as DNA sequence alignments, are analyzed without further re-coding or any a priori determination of character polarity. Here we present 1001, a simple, freely available Python-based tool that helps re-code matrices of non-additive characters as different types of binary matrices. Despite its historical basis, our analytical approach to DNA and protein data has not been properly investigated since the beginning of the molecular age. The polarized matrices produced by 1001 can be used as proper inputs for Cladistic analysis, as well as inputs for further three-taxon permutations. The 1001 binary representations of molecular data (not necessarily polarized) may also be used as inputs for various parametric software. This may help to reduce the complicated sets of assumptions that normally precede either Bayesian or Maximum Likelihood analyses.
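To make the idea of binary re-coding concrete, the following is a minimal sketch of one possible scheme, not the implementation used in 1001. It assumes a presence/absence coding in which every observed state of a character becomes its own binary column; when an outgroup is named, the outgroup's state is treated as plesiomorphic and receives no column, so the outgroup is scored 0 throughout. The function name `binary_recode`, the `outgroup` parameter, and the toy alignment are hypothetical, and gaps or missing data are not treated specially.

```python
def binary_recode(matrix, outgroup=None):
    """Re-code an unordered multistate matrix as binary columns.

    matrix   : dict mapping taxon name -> string of states (equal lengths)
    outgroup : optional taxon name; if given, its state at each position
               is assumed plesiomorphic and gets no column (polarized coding)

    Illustrative assumption only, not the algorithm of the 1001 tool.
    """
    taxa = list(matrix)
    n_chars = len(matrix[taxa[0]])
    recoded = {taxon: [] for taxon in taxa}

    for i in range(n_chars):
        # All states observed at this position, in a fixed (sorted) order.
        states = sorted({matrix[taxon][i] for taxon in taxa})
        if outgroup is not None:
            # Polarized variant: drop the outgroup state.
            states = [s for s in states if s != matrix[outgroup][i]]
        # One binary column per remaining state: 1 = taxon has this state.
        for state in states:
            for taxon in taxa:
                recoded[taxon].append("1" if matrix[taxon][i] == state else "0")

    return {taxon: "".join(bits) for taxon, bits in recoded.items()}


# Toy alignment (hypothetical data): three taxa, two sites.
alignment = {"out": "AC", "t1": "AG", "t2": "TG"}

unpolarized = binary_recode(alignment)
polarized = binary_recode(alignment, outgroup="out")

for name in alignment:
    print(name, unpolarized[name], polarized[name])
```

In this sketch the unpolarized call yields one column per observed state, while the polarized call keeps only the columns for states that differ from the outgroup, which is the kind of matrix that could then be passed to a cladistic or three-taxon analysis.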
Author Comment
Here we present 1001, a simple, freely available Python-based tool that helps re-code matrices of non-additive characters as different types of binary matrices.