3 downloads
83 views

Background: Software maintenance is an important activity in the development process, during which maintenance team members leave and new members join over time. The identification of files that are frequently changed together has been proposed several times. Yet, existing...

["Data Science","Software Engineering"]
doi:10.7717/peerj-cs.135
7 downloads
64 views

Ecological niche modeling (ENM) is increasingly being used in studying the relationship between species distributions and environmental conditions. The development of ENM software/algorithms is heading toward open-source programming, for the advantage of efficiency...

["Biogeography","Bioinformatics","Ecology","Zoology","Data Science"]
doi:10.7287/peerj.preprints.3346v1
3 downloads
17 views

The conditional mutual information \(I(X;Y|Z)\) measures the average information that X and Y contain about each other given Z. This is an important primitive in many learning problems including conditional independence testing, graphical model inference, causal...

["Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.3345v1
171 downloads
433 views

Computational models in biology encode molecular and cell biological processes. These models often can be represented as biochemical reaction networks. Studying such networks, one is mostly interested in systems that share similar reactions and mechanisms. Typical...

["Bioinformatics","Computational Biology","Data Mining and Machine Learning","Data Science"]
doi:10.7287/peerj.preprints.1479v3
5 downloads
68 views

It has been estimated that up to 80% of all data stored in health care databases may have spatial components. To fully exploit such components, there is a need to improve existing tools or develop novel spatio-temporal functionalities. Geographic information...

["Data Science","Spatial and Geographic Information Systems"]
doi:10.7287/peerj.preprints.3335v1
25 downloads
104 views

Nowadays, a huge amount of biomedical data of different biological entities is provided by many online databases and services, each with its own data model, user interface and query language. However, typical bioinformatics scenarios require the use of more than...

["Bioinformatics","Data Science"]
doi:10.7287/peerj.preprints.3309v1
25 downloads
56 views

We present GenotypeAnalytics (GA), a RESTful service that makes it possible to mine association rules from Single Nucleotide Polymorphism (SNP) datasets using standard web browsers. GA can speed up and simplify the analysis of this massive amount of data, highlighting...

["Bioinformatics","Computational Biology","Data Mining and Machine Learning","Data Science","World Wide Web and Web Science"]
doi:10.7287/peerj.preprints.3299v1
17 downloads
70 views

REDCap (Research Electronic Data Capture) is one of the most popular web-based applications to support data capture for research studies and registries. i2b2 (Informatics for Integrating Biology and the Bedside) is a widely adopted data warehouse to re-use clinical...

["Data Science","Databases"]
doi:10.7287/peerj.preprints.3294v1
1,378 downloads
4,241 views

Forecasting is a common data science task that helps organizations with capacity planning, goal setting, and anomaly detection. Despite its importance, there are serious challenges associated with producing reliable and high quality forecasts — especially when...

["Data Science"]
doi:10.7287/peerj.preprints.3190v2
34 downloads
344 views

There are few truly bad ideas in authentic science. We need to embrace science as a process-driven human endeavour to better understand the world around us. Products are important, but through better transparency, we can leverage ideas, good and bad, ours and...

["Ecology","Human-Computer Interaction","Data Science"]
doi:10.7287/peerj.preprints.3282v1
84 downloads
302 views

Background. Artificial enrichment of lakes has posed serious management problems for water supply. As a result, many European lakes have already undergone significant eutrophication. It seems that a good tool to determine the influence of catchment use on the trophic...

["Data Science","Spatial and Geographic Information Systems"]
doi:10.7287/peerj.preprints.2203v2
79 downloads
376 views

Two major environmental threats to urban resilience are widely recognized: increasing summer air temperatures and the soil consumption that affects a large number of cities in Italy. This work aims to present, in preliminary form, the current Heat Summer...

["Data Science","Spatial and Geographic Information Systems"]
doi:10.7287/peerj.preprints.2234v2
116 downloads
496 views

Identifying and monitoring severe weather impacts through social media data is a worthwhile challenge for data science. In recent years we have witnessed an increase in natural disasters, partly due to climate change. Many studies have shown that during such events people tend...

["Data Science","Emerging Technologies","Natural Language and Speech","Network Science and Online Social Networks"]
doi:10.7287/peerj.preprints.2241v2
22 downloads
183 views

Bacterial surfaces are complex, built from membranes, peptidoglycans and, importantly, proteins. The proteins play crucial roles as key regulators of how the bacterium interacts with its environment. A full catalog of the motifs in coiled-coil proteins and...

["Bioinformatics","Infectious Diseases","Data Science"]
doi:10.7287/peerj.preprints.3118v2
127 downloads
952 views

We describe best practices for providing convenient, high-speed, secure access to large data via research data portals. We capture these best practices in a new design pattern, the Modern Research Data Portal, that disaggregates the traditional monolithic web-based...

["Computer Networks and Communications","Data Science","Distributed and Parallel Computing","Security and Privacy","World Wide Web and Web Science"]
doi:10.7287/peerj.preprints.3194v2
