28 downloads
336 views

Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare,...

["Bioinformatics","Data Science","Databases","Emerging Technologies","World Wide Web and Web Science"]
doi:10.7717/peerj-cs.110
63 downloads
261 views

Docker allows packaging an application with its dependencies into a standardized, self-contained unit (a so-called container), which can be used for software development and to run the application on any system. Dockerfiles are declarative definitions of an environment...

["Data Science","Software Engineering"]
doi:10.7287/peerj.preprints.2905v1
46 downloads
398 views

Music transcription involves the transformation of an audio recording to common music notation, colloquially referred to as sheet music. Manually transcribing audio recordings is a difficult and time-consuming process, even for experienced musicians. In response,...

["Data Mining and Machine Learning","Data Science"]
doi:10.7717/peerj-cs.109
110 downloads
536 views

A detailed review of a recent data science book by Hadley Wickham and Garrett Grolemund is developed herein. Technical book reviews should provide a guide to the readers, a sense of the appropriate audience, the specifics of the software/language, and identify...

["Computational Biology","Data Science"]
doi:10.7287/peerj.preprints.2873v1
35 downloads
558 views

Despite recent algorithmic improvements, learning the optimal structure of a Bayesian network from data is typically infeasible past a few dozen variables. Fortunately, domain knowledge can frequently be exploited to achieve dramatic computational savings, and...

["Artificial Intelligence","Data Mining and Machine Learning","Data Science","Distributed and Parallel Computing"]
doi:10.7287/peerj.preprints.2872v1
20 downloads
23 views

Over the past 18 months, we have been working on a dashboard concept that gives researchers a means of interacting with existing research. This work was motivated by the National Data Service (NDS), which is an emerging vision of how scientists and researchers...

["Human-Computer Interaction","Computer Architecture","Data Science","World Wide Web and Web Science","Software Engineering"]
doi:10.7287/peerj.preprints.2845v1
653 downloads
533 views

ATLAS (Automatic Tool for Local Assembly Structures) is a comprehensive multi-omics data analysis pipeline that is massively parallel and scalable. ATLAS contains a modular analysis pipeline for assembly, annotation, quantification and genome binning of metagenomics...

["Bioinformatics","Computational Biology","Data Science","Scientific Computing and Simulation","Software Engineering"]
doi:10.7287/peerj.preprints.2843v1
4 downloads
42 views

Modern biomedical research aims at drawing biological conclusions from large, highly complex biological datasets. Nowadays, it is common practice to make extensive use of high-throughput technologies that produce large amounts of heterogeneous data. In addition to...

["Bioinformatics","Computational Biology","Data Science","Databases","Distributed and Parallel Computing"]
doi:10.7287/peerj.preprints.2839v1
38 downloads
72 views

This study investigates the effects of using a large data set on supervised machine learning classifiers in the domain of Intrusion Detection Systems (IDS). To investigate this effect, 12 machine learning algorithms have been applied. These algorithms are: (1) Adaboost,...

["Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.2838v1
10 downloads
48 views

Climate and biodiversity systems are closely interlaced across a wide range of scales. To better understand the mutual interaction between climate change and biodiversity there is a strong need for multidisciplinary skills, tools and a large variety of heterogeneous,...

["Data Science","Scientific Computing and Simulation","Software Engineering"]
doi:10.7287/peerj.preprints.2834v1
10 downloads
221 views

Nowadays, the daily work of many research communities is characterized by an increasing amount and complexity of data. This makes the data increasingly difficult to manage, access and utilize in order to ultimately gain scientific insights from it. At the same time, domain...

["Data Science","Distributed and Parallel Computing"]
doi:10.7287/peerj.preprints.2831v1
71 downloads
275 views

MerCat (“Mer-Catenate”) is a parallel, highly scalable and modular property software package for robust analysis of features in next-generation sequencing data. Using assembled contigs and raw sequence reads from any platform as input, MerCat performs k-mer...

["Bioinformatics","Computational Biology","Data Science","Scientific Computing and Simulation","Software Engineering"]
doi:10.7287/peerj.preprints.2825v1
11 downloads
71 views

Introduction: Computational reproducibility refers to the possibility of reconstructing all the steps of a workflow that connects raw data, processed data and results: it is a fundamental issue in the omic studies because of the complex and high-dimensional nature...

["Bioinformatics","Computational Biology","Data Science"]
doi:10.7287/peerj.preprints.2823v1
7 downloads
34 views

The HUBzero platform is an infrastructure enabling online scientific communities to collaborate and share information and computational resources as they explore scientific phenomena. HUBzero has been adopted by the Regenstrief Center for Healthcare Engineering...

["Data Science","Visual Analytics"]
doi:10.7287/peerj.preprints.2819v1
128 downloads
831 views

Metastatic cutaneous melanoma is an aggressive skin cancer with some progression-slowing treatments but no known cure. The omics data explosion has created many possible drug candidates; however, filtering criteria remain challenging, and systems biology approaches...

["Bioinformatics","Computational Biology","Data Science","World Wide Web and Web Science"]
doi:10.7717/peerj-cs.106
