63 downloads
115 views

Finding useful patterns in datasets has attracted considerable interest in the field of visual analytics. One of the most common tasks is the identification and representation of clusters. However, this is non-trivial in heterogeneous datasets since the data needs...

["Data Science","Visual Analytics"]
doi:10.7287/peerj.preprints.3448v1
336 downloads
862 views

The sharing and re-use of data has become a cornerstone of modern science. Multiple platforms now allow quick and easy data sharing. So far, however, data publishing models have not accommodated on-going scientific improvements in data: for many problems, datasets...

["Computational Biology","Ecology","Computational Science","Data Science"]
doi:10.7287/peerj.preprints.3401v1
49 downloads
333 views

In translational medicine, the technology of RNA sequencing (RNA-seq) continues to prove powerful, and transforming the RNA-seq data into biological insights has become increasingly imperative. We present the Transcriptomics profiler for Easy Discovery (TED) toolkit,...

["Bioinformatics","Genomics","Translational Medicine","Computational Science","Data Science"]
doi:10.7287/peerj.preprints.3385v1
73 downloads
573 views

There are few truly bad ideas in authentic science. We need to embrace science as a process-driven human endeavour to better understand the world around us. Products are important, but through better transparency, we can leverage ideas, good and bad, ours and...

["Ecology","Human-Computer Interaction","Data Science"]
doi:10.7287/peerj.preprints.3282v2
53 downloads
490 views

Background. Software maintenance is an important activity in the development process where maintenance team members leave and new members join over time. The identification of files which are changed together frequently has been proposed several times. Yet, existing...

["Data Science","Software Engineering"]
doi:10.7717/peerj-cs.135
41 downloads
142 views

Ecological niche modeling (ENM) is increasingly being used in studying the relationship between species distributions and environmental conditions. The development of ENM software/algorithms is heading toward open-source programming, for the advantage of efficiency...

["Biogeography","Bioinformatics","Ecology","Zoology","Data Science"]
doi:10.7287/peerj.preprints.3346v1
43 downloads
93 views

The conditional mutual information \(I(X;Y|Z)\) measures the average information that X and Y contain about each other given Z. This is an important primitive in many learning problems including conditional independence testing, graphical model inference, causal...

["Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.3345v1
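
For readers unfamiliar with the quantity named in the abstract above, here is a minimal sketch of how \(I(X;Y|Z)\) can be computed exactly when the full joint distribution is known and discrete. It is not the estimator proposed in the preprint; it only evaluates the standard definition \(I(X;Y|Z) = \sum_{x,y,z} p(x,y,z) \log \frac{p(z)\,p(x,y,z)}{p(x,z)\,p(y,z)}\). The function name and the dictionary-based input format are assumptions made for illustration.

```python
import math
from collections import defaultdict


def conditional_mutual_information(joint):
    """Return I(X;Y|Z) in nats for a discrete joint distribution.

    `joint` is a hypothetical input format: a dict mapping (x, y, z)
    tuples to probabilities p(x, y, z) that sum to 1.
    """
    # Marginals needed by the definition: p(z), p(x, z), p(y, z).
    p_z = defaultdict(float)
    p_xz = defaultdict(float)
    p_yz = defaultdict(float)
    for (x, y, z), p in joint.items():
        p_z[z] += p
        p_xz[(x, z)] += p
        p_yz[(y, z)] += p

    total = 0.0
    for (x, y, z), p in joint.items():
        if p > 0:  # zero-probability cells contribute nothing
            ratio = p * p_z[z] / (p_xz[(x, z)] * p_yz[(y, z)])
            total += p * math.log(ratio)
    return total


# X and Y are the same fair bit, independent of Z: I(X;Y|Z) = H(X) = ln 2.
copy_joint = {(0, 0, 0): 0.25, (1, 1, 0): 0.25,
              (0, 0, 1): 0.25, (1, 1, 1): 0.25}

# X and Y are both copies of Z: given Z nothing is left to share, so I = 0.
determined_joint = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
```

In practice the joint distribution is rarely known and must be estimated from samples, which is the harder problem the abstract refers to.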
197 downloads
571 views

Computational models in biology encode molecular and cell biological processes. These models often can be represented as biochemical reaction networks. Studying such networks, one is mostly interested in systems that share similar reactions and mechanisms. Typical...

["Bioinformatics","Computational Biology","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.1479v3
52 downloads
162 views

It has been estimated that up to 80% of all data stored in health care databases may have spatial components. To fully exploit such components, there is a need to improve existing tools or develop novel spatio-temporal functionalities. Geographic information...

["Data Science","Spatial and Geographic Information Systems"]
doi:10.7287/peerj.preprints.3335v1
60 downloads
166 views

Nowadays, a huge amount of biomedical data of different biological entities is provided by many online databases and services, each with its own data model, user interface and query language. However, typical bioinformatics scenarios require the use of more than...

["Bioinformatics","Data Science"]
doi:10.7287/peerj.preprints.3309v1
33 downloads
76 views

We present GenotypeAnalytics (GA), a RESTful service that makes it possible to mine association rules from Single Nucleotide Polymorphism (SNP) datasets using standard web browsers. GA can speed up and simplify the analysis of this massive amount of data, highlighting...

["Bioinformatics","Computational Biology","Data Mining and Machine Learning","Data Science","World Wide Web and Web Science"]
doi:10.7287/peerj.preprints.3299v1
27 downloads
103 views

REDCap (Research Electronic Data Capture) is one of the most popular web-based applications to support data capture for research studies and registries. i2b2 (Informatics for Integrating Biology and the Bedside) is a widely adopted data warehouse to re-use clinical...

["Data Science","Databases"]
doi:10.7287/peerj.preprints.3294v1
5,397 downloads
9,192 views

Forecasting is a common data science task that helps organizations with capacity planning, goal setting, and anomaly detection. Despite its importance, there are serious challenges associated with producing reliable and high quality forecasts — especially when...

["Data Science"]
doi:10.7287/peerj.preprints.3190v2
106 downloads
355 views

Background. Artificial enrichment of lakes has posed serious management problems for water supply. As a result, many European lakes have already undergone significant eutrophication. It seems that a good tool to determine the influence of catchment use on the trophic...

["Data Science","Spatial and Geographic Information Systems"]
doi:10.7287/peerj.preprints.2203v2
90 downloads
440 views

Two major environmental threats to urban resilience are widely recognized: increasing summer air temperatures and the soil consumption that affects a large number of cities in Italy. This work aims to give a preliminary presentation of the actual Heat Summer...

["Data Science","Spatial and Geographic Information Systems"]
doi:10.7287/peerj.preprints.2234v2
