76 downloads
404 views

A detailed review of a recent data science book by Hadley Wickham and Garrett Grolemund is developed herein. Technical book reviews should provide a guide to the readers, a sense of the appropriate audience, the specifics of the software/language, and identify...

["Computational Biology","Data Science"]
doi:10.7287/peerj.preprints.2873v1
20 downloads
498 views

Despite recent algorithmic improvements, learning the optimal structure of a Bayesian network from data is typically infeasible past a few dozen variables. Fortunately, domain knowledge can frequently be exploited to achieve dramatic computational savings, and...

["Artificial Intelligence","Data Mining and Machine Learning","Data Science","Distributed and Parallel Computing"]
doi:10.7287/peerj.preprints.2872v1
15 downloads
14 views

Over the past 18 months, we have been working on a dashboard concept that gives researchers a means of interacting with existing research. This work was motivated by the National Data Service (NDS), which is an emerging vision of how scientists and researchers...

["Human-Computer Interaction","Computer Architecture","Data Science","World Wide Web and Web Science","Software Engineering"]
doi:10.7287/peerj.preprints.2845v1
500 downloads
298 views

ATLAS (Automatic Tool for Local Assembly Structures) is a comprehensive multi-omics data analysis pipeline that is massively parallel and scalable. ATLAS contains a modular analysis pipeline for assembly, annotation, quantification and genome binning of metagenomics...

["Bioinformatics","Computational Biology","Data Science","Scientific Computing and Simulation","Software Engineering"]
doi:10.7287/peerj.preprints.2843v1
2 downloads
25 views

Modern biomedical research aims at drawing biological conclusions from large, highly complex biological datasets. Nowadays, it is common practice to make extensive use of high-throughput technologies that produce big amounts of heterogeneous data. In addition to...

["Bioinformatics","Computational Biology","Data Science","Databases","Distributed and Parallel Computing"]
doi:10.7287/peerj.preprints.2839v1
9 downloads
51 views

This study investigates the effects of using a large data set on supervised machine learning classifiers in the domain of Intrusion Detection Systems (IDS). To investigate this effect 12 machine learning algorithms have been applied. These algorithms are: (1) Adaboost,...

["Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.2838v1
8 downloads
30 views

Climate and biodiversity systems are closely interlaced across a wide range of scales. To better understand the mutual interaction between climate change and biodiversity there is a strong need for multidisciplinary skills, tools and a large variety of heterogeneous,...

["Data Science","Scientific Computing and Simulation","Software Engineering"]
doi:10.7287/peerj.preprints.2834v1
6 downloads
192 views

Nowadays, the daily work of many research communities is characterized by an increasing amount and complexity of data. This makes it increasingly difficult to manage, access and utilize to ultimately gain scientific insights based on it. At the same time, domain...

["Data Science","Distributed and Parallel Computing"]
doi:10.7287/peerj.preprints.2831v1
51 downloads
221 views

MerCat (“Mer - Catenate”) is a parallel, highly scalable and modular property software package for robust analysis of features in next-generation sequencing data. Using assembled contigs and raw sequence reads from any platform as input, MerCat performs k-mer...

["Bioinformatics","Computational Biology","Data Science","Scientific Computing and Simulation","Software Engineering"]
doi:10.7287/peerj.preprints.2825v1
8 downloads
57 views

Introduction. Computational reproducibility refers to the possibility of reconstructing all the steps of a workflow that connects raw data, processed data and results: it is a fundamental issue in the omic studies because of the complex and high-dimensional nature...

["Bioinformatics","Computational Biology","Data Science"]
doi:10.7287/peerj.preprints.2823v1
3 downloads
26 views

The HUBzero platform is an infrastructure enabling online scientific communities to collaborate and share information and computational resources as they explore scientific phenomena. HUBzero has been adopted by the Regenstrief Center for Healthcare Engineering...

["Data Science","Visual Analytics"]
doi:10.7287/peerj.preprints.2819v1
90 downloads
647 views

Metastatic cutaneous melanoma is an aggressive skin cancer with some progression-slowing treatments but no known cure. The omics data explosion has created many possible drug candidates; however, filtering criteria remain challenging, and systems biology approaches...

["Bioinformatics","Computational Biology","Data Science","World Wide Web and Web Science"]
doi:10.7717/peerj-cs.106
14 downloads
184 views

Motivated by the increasing amount of voices who ask for careful consideration of what context-rich data analysis methods can tell us about the activities of human collectives, we contribute an argumentation that employs a dialectic of literature on the philosophy...

["Data Mining and Machine Learning","Data Science","Network Science and Online Social Networks","Social Computing","World Wide Web and Web Science"]
doi:10.7287/peerj.preprints.2789v1
18 downloads
63 views

This document describes a novel way to extract structure information from plain text using Markov Decision Process. In the age of big data, unstructured information such as text, photos and videos becomes abundant. However, data warehouse requires structured...

["Data Science"]
doi:10.7287/peerj.preprints.2774v1
54 downloads
379 views

While most challenges organized so far in the Semantic Web domain are focused on comparing tools with respect to different criteria such as their features and competencies, or exploiting semantically enriched data, the Semantic Web Evaluation Challenges series,...

["Data Science","Digital Libraries","Emerging Technologies","World Wide Web and Web Science"]
doi:10.7717/peerj-cs.105
