Spotlight on: Practical Data Science for Stats PeerJ Collection

What better way to get back into the swing of term than a collection of resources on data science workflows and statistical analysis? Jennifer Bryan and Hadley Wickham have curated a set of preprints on the day-to-day aspects of data analysis, useful for the seasoned statistician and the occasional data wrangler alike. The goal of this collection is to increase the visibility and adoption of modern data analytical workflows.

Some notable standouts for the PeerJ community: ‘Excuse me, do you have a moment to talk about version control?’ by Jennifer Bryan, ‘The democratization of data science education’ by Sean Kross et al., and ‘Data organization in spreadsheets’ by Karl Broman and Kara Woo.

PeerJ in the news: Blue-green dinosaur eggs are a big deal

You know how robins and other birds lay such beautiful blue-green eggs? Well, according to research published in PeerJ earlier this month, this can actually be traced back to their dinosaur ancestors. Researchers’ chemical analysis revealed the first record of the avian eggshell pigments protoporphyrin and biliverdin in the eggshells of Late Cretaceous oviraptorid dinosaurs.

National Geographic, Gizmodo, and Live Science reported on the findings. “With new machines and new techniques, it’s very exciting what can potentially be found in fossils” – David Varricchio. Read the full study here: Wiemann et al. (2017) Dinosaur origin of egg color: oviraptors laid blue-green eggs.

Machine learning and editorial efficiency at PeerJ

Last month we announced to our editorial board that we are now using a statistical machine learning approach based on topic modeling to match submitted manuscripts to Academic Editors (starting with the PeerJ journal). With a month of data behind us now, one performance measure indicates just how well the new algorithm is doing.

But first, just what is “topic modeling”? Ordinary search algorithms often stumble on words that look identical but mean different things. For example, does ‘Apple’ refer to the company or the fruit? That kind of ambiguity can lead to poorly matched invitations. Topic modeling attempts to get around this by generating a set of topics that capture the meaning behind the words. There is more to it than that, of course, but the point is that, when used appropriately, topic modeling can lead to matches that make much more sense most of the time.
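To make the idea a little more concrete, here is a minimal sketch in Python. It is not PeerJ's actual matching code: the two topics, the topic weights, and the editor names are all made up for illustration. It assumes a topic model has already turned each manuscript and each editor's publications into a mix of topics, and it simply picks the editor whose mix is closest to the manuscript's.

```python
import math

def cosine(a, b):
    """Cosine similarity between two topic-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical topic mixes over two topics: (technology, agriculture).
# A manuscript about apple orchards mentions "apple" a lot, but its
# topic mix leans heavily toward agriculture, not technology.
manuscript = [0.1, 0.9]

# Hypothetical editors, each described by the topic mix of their own research.
editors = {
    "tech_editor": [0.95, 0.05],
    "plant_editor": [0.15, 0.85],
}

def best_match(manuscript, editors):
    """Return the name of the editor whose topic mix is most similar."""
    return max(editors, key=lambda name: cosine(manuscript, editors[name]))

print(best_match(manuscript, editors))
```

A plain keyword search would treat both editors as equally good matches for the word “apple”; comparing topic mixes is what separates them in this toy example.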

When an editor is invited to handle a PeerJ journal manuscript, the invitation is bucketed into one of three levels based on how similar the submission is to the editor’s own research: “High”, “Mid”, and “Lower.” What we see in the data is that “High” matches are 12x as likely to be handled as “Lower” matches, and “Mid” matches are 5x as likely.
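For readers who like to see the arithmetic, the bucketing and the “times as likely” comparison can be sketched roughly as follows. The similarity cutoffs and the invitation counts are invented for illustration only; they are not PeerJ's real thresholds or data, and were chosen so the ratios echo the ones quoted above.

```python
def bucket(similarity, high=0.75, mid=0.4):
    """Assign a match level from a similarity score (cutoffs are hypothetical)."""
    if similarity >= high:
        return "High"
    if similarity >= mid:
        return "Mid"
    return "Lower"

def relative_rates(invited, handled, baseline="Lower"):
    """How many times as likely each level is to be handled, versus the baseline."""
    rates = {level: handled[level] / invited[level] for level in invited}
    return {level: rates[level] / rates[baseline] for level in rates}

# Made-up counts: invitations sent and invitations accepted per level.
invited = {"High": 100, "Mid": 100, "Lower": 100}
handled = {"High": 24, "Mid": 10, "Lower": 2}

for level, ratio in relative_rates(invited, handled).items():
    print(f"{level}: {ratio:.0f}x as likely as Lower")
```

The ratio compares acceptance rates rather than raw counts, so it stays meaningful even when the three levels receive different numbers of invitations.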

Ultimately, of course, what really matters (from the author perspective) is whether your paper is handled more quickly, and with greater eagerness and skill, by better-matched editors and reviewers. To measure that, we are gathering continuous feedback from the editorial board and we’ll keep you updated.

Charlie makes an appearance!

With Open Access Week coming up next month, our cute-as-a-button monkey mascot, Charlie, is ready and on-call for our advocacy and outreach efforts.

Read more on the Origin of Charlie, our beloved mascot

Latest from the PeerJ blog

We interviewed author Carly Strasser on her study, ‘Estimated effects of implementing an open access policy for grantees at a private foundation’, which looked at the effects of The Gordon and Betty Moore Foundation’s policy to require grantees to publish in open access journals.

For Mesothelioma Awareness Day we interviewed editor Emanuela Felley-Bosco about her work on translating mesothelioma research into clinical outcomes. The Mesothelioma Cancer Alliance shared why access to rare cancer research is vital for patient engagement and their own continued advocacy efforts.

