Preprints (not yet peer-reviewed)

Many current and future data scientists will be "isolated"---working alone or in small teams within a larger organization. This isolation brings certain challenges as well as freedoms. Drawing on my considerable experience both working in the professional sports...

["Computer Education","Data Science","Scientific Computing and Simulation","Social Computing"]
doi:10.7287/peerj.preprints.3160v1
4 downloads
21 views

Data analysis, statistical research, and teaching statistics have at least one thing in common: these activities all produce many files! There are data files, source code, figures, tables, prepared reports, and much more. Most of these files evolve over the course...

["Computer Education","Data Science","Scientific Computing and Simulation","Software Engineering"]
doi:10.7287/peerj.preprints.3159v1
5 downloads
16 views

We describe the \textsc{Coefficient-Flow} algorithm for calculating the bounding chain of an $(n-1)$--boundary on an $n$--manifold-like simplicial complex $S$. We prove its correctness and show that it has a computational time complexity of $O(|S^{(n-1)}|)$ (where...

["Algorithms and Analysis of Algorithms","Data Science","Scientific Computing and Simulation"]
doi:10.7287/peerj.preprints.3151v1
161 downloads
1,010 views

Within the statistics community, a number of guiding principles for sharing data have emerged; however, these principles are not always made clear to collaborators generating the data. To bridge this divide, we have established a set of guidelines for sharing data....

["Statistics","Data Science"]
doi:10.7287/peerj.preprints.3139v1
2,088 downloads
7,644 views

Despite growing interest in Open Access (OA) to scholarly literature, there is an unmet need for large-scale, up-to-date, and reproducible studies assessing the prevalence and characteristics of OA. We address this need using oaDOI, an open online service that...

["Legal Issues","Science Policy","Data Science"]
doi:10.7287/peerj.preprints.3119v1
12 downloads
93 views

Bacterial surfaces are complex, built from membranes, peptide-glycans and, importantly, proteins. The proteins play crucial roles as the key regulators of how the bacterium interacts with its environment. A full catalog of the motifs in coiled-coil proteins and...

["Bioinformatics","Infectious Diseases","Data Science"]
doi:10.7287/peerj.preprints.3118v1
31 downloads
155 views

Sigmoidal and double-sigmoidal dynamics are commonly observed in many areas of biology. Here we present sicegar, an R package for the automated fitting and classification of sigmoidal and double-sigmoidal data. The package categorizes data into one of three categories,...

["Bioinformatics","Computational Biology","Mathematical Biology","Statistics","Data Science"]
doi:10.7287/peerj.preprints.3116v1
91 downloads
366 views

Sharing and reusing data in research is a welcome and encouraged practice since it maximises the scientific outcomes given limited financial, material and human resources. Interdisciplinary research is considered to benefit from this practice, uniting researchers...

["Bioinformatics","Computational Biology","Data Science","Spatial and Geographic Information Systems"]
doi:10.7287/peerj.preprints.2248v4
24 downloads
135 views

Background. There is a huge amount of full-text biomedical literature available in public repositories like PubMed Central (PMC). However, a substantial number of the papers are in Portable Document Format (PDF) and do not provide plain text format ready for text...

["Bioinformatics","Data Science","Databases","Digital Libraries","World Wide Web and Web Science"]
doi:10.7287/peerj.preprints.2993v1
245 downloads
649 views

A detailed review of a recent data science book by Hadley Wickham and Garrett Grolemund is developed herein. Technical book reviews should provide a guide to the readers, a sense of the appropriate audience, the specifics of the software/language, and identify...

["Computational Biology","Data Science"]
doi:10.7287/peerj.preprints.2873v1
772 downloads
865 views

ATLAS (Automatic Tool for Local Assembly Structures) is a comprehensive multi-omics data analysis pipeline that is massively parallel and scalable. ATLAS contains a modular analysis pipeline for assembly, annotation, quantification and genome binning of metagenomics...

["Bioinformatics","Computational Biology","Data Science","Scientific Computing and Simulation","Software Engineering"]
doi:10.7287/peerj.preprints.2843v1
9 downloads
127 views

Modern biomedical research aims at drawing biological conclusions from large, highly complex biological datasets. Nowadays, it is common practice to make extensive use of high-throughput technologies that produce large amounts of heterogeneous data. In addition to...

["Bioinformatics","Computational Biology","Data Science","Databases","Distributed and Parallel Computing"]
doi:10.7287/peerj.preprints.2839v1
124 downloads
393 views

MerCat (“Mer-Catenate”) is a parallel, highly scalable and modular property software package for robust analysis of features in next-generation sequencing data. Using assembled contigs and raw sequence reads from any platform as input, MerCat performs k-mer...

["Bioinformatics","Computational Biology","Data Science","Scientific Computing and Simulation","Software Engineering"]
doi:10.7287/peerj.preprints.2825v1
19 downloads
117 views

Introduction. Computational reproducibility refers to the possibility of reconstructing all the steps of a workflow that connects raw data, processed data and results: it is a fundamental issue in the omic studies because of the complex and high-dimensional nature...

["Bioinformatics","Computational Biology","Data Science"]
doi:10.7287/peerj.preprints.2823v1
243 downloads
1,512 views

Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare,...

["Bioinformatics","Data Science","Databases","Emerging Technologies","World Wide Web and Web Science"]
doi:10.7287/peerj.preprints.2522v2