A peer-reviewed article of this Preprint also exists.
L 37: Perhaps add some comments about whether OTL will be around for a long time? I imagine they plan to be, but it would be nice to tell readers that, so they sense this is a long-term thing that won't go away.
Reading this, I was curious how the OTL name system fits into the various naming systems out there. Maybe some discussion of that?
I imagine name resolution is going to be a big one. Do the built-in functions that work with names cover most use cases? Or is something more needed sometimes, e.g., http://resolver.globalnames.org/
First of all, great package! I look forward to using it. Thanks too for using PeerJ PrePrints, I'm not as motivated to post comments elsewhere but I like the StackOverflow-style points system here, so it motivates me to bother.
My comments here solely concern the written paper, not the actual software. (I will try and play with the software later if I get the time to...)
Line 35: I think a reference to Magee et al.'s paper (1) would fit well here. It's very much in the same vein as the Stoltzfus et al. and Drew et al. papers.
Lines 50-55: I'm surprised to see no love/citation for phangorn (2). Perhaps we live in different parallel worlds of phylogenetic R packages but that would be high on my list to mention in such a section of relevant packages! I do note you reference a more comprehensive list later.
Line 60: I'm calling bullshit on R as "...an ideal platform for reproducible research in phylogenetics and comparative biology...".
I looked up the definition of 'ideal' just to make sure I wasn't wrong and here's what I found:
A) "satisfying one's conception of what is perfect; most suitable."
B) "a person or thing regarded as perfect."
At the very least, describing R as 'ideal' for phylogenetics shouldn't pass peer review without more supporting evidence / discussion. It's also fairly trivial to state that use of R may "allow a complete record of the steps taken in gathering, processing, and analyzing a given data set to be produced". That's true of any open source, well documented workflow; it's not special to R.
In my experience R is good for handling small, relatively 'downstream' phylogenetic data. For instance, objectively speaking, R is not and likely will never be the key language/programme for computing phylogenies from raw data. It simply isn't 'speedy' enough. At best it can provide a wrapper around the specialised programs that do the actual hard work. Not one of PAUP, TNT, RAxML, MrBayes, RevBayes, BEAST, POY, MEGA, or anything else most people use to infer phylogeny is written in R.
In terms of sheer computational performance, perhaps Julia (3) might in future be a more promising language than R for phylogenetics? Phylogenetics packages are currently being built for Julia (4) and performance-wise it wouldn't surprise me if they were clearly superior to R equivalents. Admittedly Julia hasn't got a mature set of phylogenetics packages at the moment, but you used the word 'ideal' and my point is that in terms of performance R is demonstrably far from 'ideal' relative to implementations in other languages! As I understand it, Julia is better for parallel computing and thus will perform permutations & bootstraps faster - key types of calculations performed in phylogenetics.
I'd encourage the authors to back away from the word 'ideal' here. R sure is popular with biologists at the moment, but it is not perfect or ideal. It could potentially be surpassed in popularity, utility AND performance in future by other languages like Julia.
1. Magee, A. F., May, M. R., and Moore, B. R. 2014. The dawn of open access to phylogenetic data. PLoS ONE 9(10):e110268. http://dx.doi.org/10.1371/journal.pone.0110268
2. Schliep, K. P. 2011. phangorn: phylogenetic analysis in R. Bioinformatics 27:592-593. http://dx.doi.org/10.1093/bioinformatics/btq706
3. Bezanson, J., Karpinski, S., Shah, V. B., and Edelman, A. 2012. Julia: A fast dynamic language for technical computing. http://arxiv.org/abs/1209.5145