Preprints (not yet peer-reviewed)

4 downloads
30 views

Predicting disease status for a complex human disease using genomic data is an important, yet challenging, step in personalized medicine. Among many challenges, the so-called curse of dimensionality results in unsatisfactory performance of many state-of-the-art...

["Bioinformatics","Computational Biology","Genomics","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.27123v1
9 downloads
26 views

The increasing availability of open data and the demand to better understand the nature of anomalies and their underlying causes in modern systems are encouraging researchers to analyse open datasets in various ways. These include both quantitative and qualitative...

["Data Mining and Machine Learning","Data Science","Security and Privacy","World Wide Web and Web Science"]
doi:10.7287/peerj.preprints.27116v1
9 downloads
29 views

Remote homology detection is the problem of detecting homology in cases of low sequence similarity. It is a hard computational problem with no approach that works well in all cases. Methods based on profile hidden Markov models (HMM) often exhibit relatively higher...

["Bioinformatics","Artificial Intelligence","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.27111v1
32 downloads
54 views

This paper addresses two questions related to reproducibility within the context of research related to computer science. First, requirements on reproducibility are analyzed based on a survey addressed to researchers in the academic and private sectors. The survey...

["Data Science","Scientific Computing and Simulation","Software Engineering"]
doi:10.7287/peerj.preprints.27082v1
103 downloads
397 views

The success of personalized medicine relies not only on methodological advances but also on the availability of data to learn from. While the generation and sharing of large data sets is becoming increasingly easy, there is a remarkable lack of diversity within...

["Genomics","Neuroscience","Ethical Issues","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.27079v1
33 downloads
64 views

The recent increase in the production of high-resolution digital elevation models (DEMs) from lidar data has led to interest in their use for terrain mapping. Although the impact of different resolutions has been studied relative to terrain characteristics like roughness,...

["Data Science","Spatial and Geographic Information Systems"]
doi:10.7287/peerj.preprints.27072v1
312 downloads
776 views

Computational models in biology encode molecular and cell biological processes. These models often can be represented as biochemical reaction networks. Studying such networks, one is mostly interested in systems that share similar reactions and mechanisms. Typical...

["Bioinformatics","Computational Biology","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.1479v4
35 downloads
90 views

Tree species classification using hyperspectral imagery is a challenging task due to the high spectral similarity between species and large intra-species variability. This paper proposes a solution using the Multiple Instance Adaptive Cosine Estimator (MI-ACE)...

["Plant Science","Computational Science","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.27052v1
1,381 downloads
4,244 views

There has been much discussion of a "replication crisis" related to statistical inference, which has largely been attributed to overemphasis on and abuse of hypothesis testing. Much of the abuse stems from failure to recognize that statistical tests not only test...

["Science and Medical Education","Science Policy","Statistics","Data Science"]
doi:10.7287/peerj.preprints.26857v2
80 downloads
234 views

In recent years, the pharmaceutical industry has been confronted with rising R&D costs paired with decreasing productivity. Attrition rates for new molecules are tremendous, with a substantial number of molecules failing in an advanced stage of development. Repositioning...

["Bioinformatics","Drugs and Devices","Computational Science","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.27002v1
112 downloads
203 views

To accelerate scientific progress on remote tree classification, as well as biodiversity and ecology sampling, the National Institute of Standards and Technology created a community-based competition where scientists were invited to contribute informatics methods for...

["Ecology","Data Mining and Machine Learning","Data Science","Forestry","Spatial and Geographic Information Science"]
doi:10.7287/peerj.preprints.26971v1
368 downloads
691 views

Ecology has reached the point where data science competitions, in which multiple groups solve the same problem using the same data by different methods, will be productive for advancing quantitative methods for tasks such as species identification from remote sensing...

["Ecology","Data Mining and Machine Learning","Data Science","Forestry","Spatial and Geographic Information Science"]
doi:10.7287/peerj.preprints.26966v1
312 downloads
615 views

We propose a simple neural network model which can learn relations between sentences by passing their representations, obtained from a Long Short-Term Memory (LSTM) network, through a Relation Network. The Relation Network module tries to extract similarity between multiple...

["Artificial Intelligence","Data Science","Natural Language and Speech"]
doi:10.7287/peerj.preprints.26847v2
504 downloads
579 views

A problem facing healthcare record systems throughout the world is how to share the medical data with more stakeholders for various purposes without sacrificing data privacy and integrity. Blockchain, operating in a state of consensus, is the underpinning technology...

["Computer Networks and Communications","Cryptography","Data Science","Security and Privacy"]
doi:10.7287/peerj.preprints.26942v1
79 downloads
145 views

Flow cytometry (FCM) is a powerful analytical tool that is used worldwide, as it allows the depiction of the innate complexity of a vast range of biological systems in a few seconds. It is a technique based on the spectroscopic properties of suspended particles...

["Bioinformatics","Computational Biology","Ecology","Ecosystem Science","Data Science"]
doi:10.7287/peerj.preprints.26934v1
