Preprints (not yet peer-reviewed)

150 downloads
928 views

We present a CUDA-based implementation of a decision tree construction algorithm within the gradient boosting library XGBoost. The tree construction algorithm is executed entirely on the GPU and shows high performance with a variety of datasets and settings, including...

["Artificial Intelligence","Data Mining and Machine Learning"]
doi:10.7287/peerj.preprints.2911v1
50 downloads
609 views

Despite recent algorithmic improvements, learning the optimal structure of a Bayesian network from data is typically infeasible past a few dozen variables. Fortunately, domain knowledge can frequently be exploited to achieve dramatic computational savings, and...

["Artificial Intelligence","Data Mining and Machine Learning","Data Science","Distributed and Parallel Computing"]
doi:10.7287/peerj.preprints.2872v1
60 downloads
91 views

This study investigates the effects of using a large data set on supervised machine learning classifiers in the domain of Intrusion Detection Systems (IDS). To investigate this effect, 12 machine learning algorithms were applied. These algorithms are: (1) Adaboost,...

["Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.2838v1
7 downloads
53 views

This paper presents the latest developments on the VIALACTEA Science Gateway in the context of the FP7 VIALACTEA project. This science gateway operates as a central workbench for the VIALACTEA community in order to allow astronomers to process the new-generation...

["Data Mining and Machine Learning","Distributed and Parallel Computing","Scientific Computing and Simulation"]
doi:10.7287/peerj.preprints.2818v2
238 downloads
405 views

Software energy consumption is a performance-related non-functional requirement that complicates building software on mobile devices today. Energy-hogging applications are a liability to both the end user and the software developer. Measuring software energy consumption...

["Data Mining and Machine Learning","Mobile and Ubiquitous Computing","Software Engineering"]
doi:10.7287/peerj.preprints.2419v3
54 downloads
588 views

The nonparametric minimum hypergeometric (mHG) test is a popular alternative to Kolmogorov-Smirnov (KS)-type tests for determining gene set enrichment. However, these approaches have not been compared to each other in a quantitative manner. Here, I first perform...

["Computational Biology","Algorithms and Analysis of Algorithms","Data Mining and Machine Learning"]
doi:10.7287/peerj.preprints.1962v3
64 downloads
93 views

Recognition of human emotions from imaging templates is useful in a wide variety of human-computer interaction and intelligent systems applications. However, the automatic recognition of facial expressions using image template matching techniques suffers from...

["Human-Computer Interaction","Algorithms and Analysis of Algorithms","Artificial Intelligence","Computer Vision","Data Mining and Machine Learning"]
doi:10.7287/peerj.preprints.2794v1
24 downloads
230 views

Motivated by the growing number of voices calling for careful consideration of what context-rich data analysis methods can tell us about the activities of human collectives, we contribute an argument that employs a dialectic of literature on the philosophy...

["Data Mining and Machine Learning","Data Science","Network Science and Online Social Networks","Social Computing","World Wide Web and Web Science"]
doi:10.7287/peerj.preprints.2789v1
121 downloads
290 views

Background. The availability of large databases containing high-resolution three-dimensional (3D) models of proteins in conjunction with functional annotation allows the exploitation of advanced supervised machine learning techniques for automatic protein function...

["Bioinformatics","Computational Biology","Data Mining and Machine Learning"]
doi:10.7287/peerj.preprints.2778v1
30 downloads
53 views

Feature selection in machine learning is of great interest because it is considered to yield more efficient predictive models in several engineering domains. It is of special importance in the pulp and paper transformation industry as the knowledge of this...

["Artificial Intelligence","Data Mining and Machine Learning"]
doi:10.7287/peerj.preprints.2749v1
35 downloads
137 views

Flight simulators are systems composed of numerous off-the-shelf components that allow pilots and maintenance crew to prepare for common and emergency flight procedures for a given aircraft model. A simulator must follow strict safety specifications to guarantee...

["Data Mining and Machine Learning","Scientific Computing and Simulation","Software Engineering"]
doi:10.7287/peerj.preprints.2670v1
14 downloads
92 views

Data mining is one of the main activities in bioinformatics, specifically to extract knowledge from massive data sets related to gene expression measurement, CNV, DNA strings, and others. A wide array of methods is used to perform this task, ranging from the...

["Bioinformatics","Computational Biology","Algorithms and Analysis of Algorithms","Data Mining and Machine Learning","Optimization Theory and Computation"]
doi:10.7287/peerj.preprints.2635v1
40 downloads
90 views

In this paper, a method for detecting image forgery in lossy compressed digital images, known as error level analysis (ELA), is presented, and its noisy components are filtered with automatic wavelet soft-thresholding. With ELA, a lossy compressed image is recompressed...

["Artificial Intelligence","Computer Vision","Data Mining and Machine Learning","Graphics"]
doi:10.7287/peerj.preprints.2619v1
101 downloads
329 views

Software forges like GitHub host millions of repositories. Software engineering researchers have been able to take advantage of such a large corpus of potential study subjects with the help of tools like GHTorrent and Boa. However, the simplicity in querying comes...

["Data Mining and Machine Learning","Software Engineering"]
doi:10.7287/peerj.preprints.2617v1
37 downloads
231 views

Sparse coding is an effective operating principle for the brain, one that can guide the discovery of features and support the learning of associations. Here we show how spiking neurons with discrete dendrites can learn sparse codes via an online, nonlinear Hebbian...

["Computational Biology","Adaptive and Self-Organizing Systems","Artificial Intelligence","Data Mining and Machine Learning"]
doi:10.7287/peerj.preprints.2595v1