Preprints (not yet peer-reviewed)

4 downloads
30 views

Predicting disease status for a complex human disease using genomic data is an important, yet challenging, step in personalized medicine. Among many challenges, the so-called curse of dimensionality problem results in unsatisfactory performance of many state-of-the-art...

["Bioinformatics","Computational Biology","Genomics","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.27123v1
9 downloads
26 views

The increasing availability of open data and the demand to better understand the nature of anomalies and their underlying causes in modern systems are encouraging researchers to analyse open datasets in various ways. These include both quantitative and qualitative...

["Data Mining and Machine Learning","Data Science","Security and Privacy","World Wide Web and Web Science"]
doi:10.7287/peerj.preprints.27116v1
9 downloads
29 views

Remote homology detection is the problem of detecting homology in cases of low sequence similarity. It is a hard computational problem with no approach that works well in all cases. Methods based on profile hidden Markov models (HMM) often exhibit relatively higher...

["Bioinformatics","Artificial Intelligence","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.27111v1
1,140 downloads
1,725 views

Random forest and similar machine learning techniques are already used to generate spatial predictions, but the spatial location of points (geography) is often ignored in the modeling process. Spatial auto-correlation, especially if still existent in the cross-validation...

["Biogeography","Soil Science","Computational Science","Data Mining and Machine Learning","Spatial and Geographic Information Science"]
doi:10.7287/peerj.preprints.26693v3
68 downloads
94 views

With the popularization of the CRISPR-Cas gene editing system, there has been an explosion of new techniques made possible by this versatile technology. However, the computational field has lagged behind, with a current lack of computational tools for developing...

["Bioinformatics","Genomics","Synthetic Biology","Data Mining and Machine Learning"]
doi:10.7287/peerj.preprints.27094v1
103 downloads
397 views

The success of personalized medicine relies not only on methodological advances but also on the availability of data to learn from. While the generation and sharing of large data sets is becoming increasingly easy, there is a remarkable lack of diversity within...

["Genomics","Neuroscience","Ethical Issues","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.27079v1
312 downloads
776 views

Computational models in biology encode molecular and cell biological processes. These models often can be represented as biochemical reaction networks. Studying such networks, one is mostly interested in systems that share similar reactions and mechanisms. Typical...

["Bioinformatics","Computational Biology","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.1479v4
593 downloads
1,079 views

Potential Natural Vegetation (PNV) is the vegetation cover in equilibrium with climate that would exist at a given location if it were not impacted by human activities. PNV is useful for raising public awareness about land degradation and for estimating land potential. This...

["Biogeography","Computational Biology","Plant Science","Data Mining and Machine Learning","Spatial and Geographic Information Science"]
doi:10.7287/peerj.preprints.26811v2
35 downloads
90 views

Tree species classification using hyperspectral imagery is a challenging task due to the high spectral similarity between species and large intra-species variability. This paper proposes a solution using the Multiple Instance Adaptive Cosine Estimator (MI-ACE)...

["Plant Science","Computational Science","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.27052v1
116 downloads
176 views

Introduction: Sleep scoring is an important step in the treatment of sleep disorders. Manual annotation of sleep stages is time-consuming and experience-dependent and, therefore, needs to be done using machine learning techniques. Methods: Sleep-edf polysomnography...

["Bioinformatics","Neuroscience","Data Mining and Machine Learning"]
doi:10.7287/peerj.preprints.27020v1
80 downloads
234 views

In recent years, the pharmaceutical industry has been confronted with rising R&D costs paired with decreasing productivity. Attrition rates for new molecules are tremendous, with a substantial number of molecules failing in an advanced stage of development. Repositioning...

["Bioinformatics","Drugs and Devices","Computational Science","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.27002v1
44 downloads
257 views

Bacterial small non-coding RNAs (sRNAs) are involved in the control of several cellular processes. Hundreds of putative sRNAs have been identified in many bacterial species through RNA sequencing. The existence of putative sRNAs is usually validated by Northern...

["Bioinformatics","Microbiology","Data Mining and Machine Learning"]
doi:10.7287/peerj.preprints.26974v1
112 downloads
203 views

To accelerate scientific progress on remote tree classification, as well as biodiversity and ecology sampling, the National Institute of Standards and Technology created a community-based competition where scientists were invited to contribute informatics methods for...

["Ecology","Data Mining and Machine Learning","Data Science","Forestry","Spatial and Geographic Information Science"]
doi:10.7287/peerj.preprints.26971v1
79 downloads
219 views

Background. Biogeographers assess how species distributions and abundances affect the structure, function, and composition of ecosystems. Yet we face a major challenge: it is difficult to precisely map species across landscapes. Novel Earth observations could obviate...

["Biogeography","Ecology","Data Mining and Machine Learning","Spatial and Geographic Information Science"]
doi:10.7287/peerj.preprints.26972v1
368 downloads
691 views

Ecology has reached the point where data science competitions, in which multiple groups solve the same problem using the same data by different methods, will be productive for advancing quantitative methods for tasks such as species identification from remote sensing...

["Ecology","Data Mining and Machine Learning","Data Science","Forestry","Spatial and Geographic Information Science"]
doi:10.7287/peerj.preprints.26966v1