To increase transparency, PeerJ operates a system of 'optional signed reviews and history'. This takes two forms: (1) peer reviewers are encouraged, but not required, to provide their names (if they do so, then their profile page records the articles they have reviewed), and (2) authors are given the option of reproducing their entire peer review history alongside their published article (in which case the complete peer review process is provided, including revisions, rebuttal letters and editor decision letters).
The revised version has addressed most of the relevant comments.
The review reports are rather critical. Among other things, the authors must be able to justify the validity of the database content.
The abstract looks concise to me.
I cannot find any figures or tables in the manuscript. It would be best if the authors could include some figures and tables to make the explanation more user-friendly. This is important because the manuscript is a database paper.
The experimental setting "pattern 12A allowing for one mismatch" has to be carefully justified in the manuscript. Have the authors tried other settings, given that this choice can significantly affect the database content?
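To make the concern concrete, the sensitivity of the database content to this setting could be checked with a small scan such as the one below. This is only a minimal sketch under my own assumptions, not the authors' script: the function name `find_polya_tracks`, the sliding-window definition of a "track", and the merging of overlapping windows are all hypothetical choices made for illustration.

```python
def find_polya_tracks(seq, window=12, max_mismatch=1):
    """Return merged half-open (start, end) intervals in which some window
    of `window` bases contains at most `max_mismatch` non-A bases.

    Assumption: this window-based definition is illustrative only and may
    differ from the definition used to build the database under review.
    """
    seq = seq.upper()
    n = len(seq)
    if n < window:
        return []
    tracks = []
    # Count non-A bases in the first window, then slide one base at a time.
    non_a = sum(base != "A" for base in seq[:window])
    for start in range(n - window + 1):
        if start > 0:
            non_a -= (seq[start - 1] != "A")           # base leaving the window
            non_a += (seq[start + window - 1] != "A")  # base entering the window
        if non_a <= max_mismatch:
            end = start + window
            if tracks and start <= tracks[-1][1]:
                # Overlaps or touches the previous hit: extend it.
                tracks[-1] = (tracks[-1][0], end)
            else:
                tracks.append((start, end))
    return tracks


if __name__ == "__main__":
    # Toy sequence (not real data): compare hit counts across settings.
    cds = "ATGGCC" + "AAAAAGAAAAAA" + "GCGC" + "A" * 12 + "GCTAA"
    for k in (0, 1, 2):
        print(k, find_polya_tracks(cds, max_mismatch=k))
```

Running the same scan with `max_mismatch` set to 0, 1 and 2 (and perhaps with other window lengths) over the same coding sequences would show how strongly the number of reported tracks depends on the chosen setting.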
The sentence "Given that no genomic database reports polyA tracks in coding sequences" is arguable. Please look at the following databases and state how your database differs from them:
It is very nice that the authors have released the scripts, which can recreate the database from scratch on the user's own computer.
The sentence "...is of course a bit simplistic...." is not appropriate in a scientific manuscript.
"The first studies" should read "The past studies".
"gathered researchers' attention" should read "drew researchers' attention".
The paper reports the generation of a database of coding-region polyA sequences from 250 genomes and a web service for querying it.
While databases are useful for researchers to look things up in their studies, I am not sure this justifies a scientific report, as there is little experimentation or science in it.
There was also no mention of validation of the search results at all: should we simply trust that the results produced by the reported processes are reliable? Are all the reported sequences in coding regions? Are they free of sequencing errors? I find it troubling that there is no mention of any quality examination of the results.
The paper reports the process of collecting the polyA sequences in coding regions. There is no mention of any validity check of the results or of quality control.
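As an illustration of the kind of basic quality check I have in mind, each source sequence could be screened for the usual properties of a complete coding sequence before its polyA tracks are reported. The sketch below is hypothetical: the function name `cds_problems` is my own, it assumes the standard genetic code and complete (not partial) CDS records, and it cannot detect sequencing errors, which would need independent evidence such as a comparison across assemblies.

```python
# Assumption: standard genetic code; organisms or organelles with
# alternative codes would need a different stop-codon set.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def cds_problems(seq):
    """Return a list of reasons why `seq` does not look like a complete CDS.

    An empty list means no problem was found by these checks; it does not
    prove that the annotation or the underlying sequence is correct.
    """
    seq = seq.upper()
    problems = []
    if set(seq) - set("ACGT"):
        problems.append("non-ACGT characters")
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of 3")
    if not seq.startswith("ATG"):
        problems.append("missing ATG start codon")
    if seq[-3:] not in STOP_CODONS:
        problems.append("missing terminal stop codon")
    # Codons in frame from the start, excluding the final three bases.
    internal = [seq[i:i + 3] for i in range(0, len(seq) - 3, 3)]
    if any(codon in STOP_CODONS for codon in internal):
        problems.append("internal stop codon")
    return problems


if __name__ == "__main__":
    # Toy sequences (not real data).
    for toy in ("ATGAAATAA", "ATGTAAAAATAA", "ATGAAANTA"):
        print(toy, cds_problems(toy))
```

Reporting how many database entries fail such checks, per genome, would give readers at least a first indication of the reliability of the content.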
No way to judge.
It is essential to validate the results in the database. It is also important to survey the distribution of such sequences in a few model genomes.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.