This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
The volume of collected genetic data has grown exponentially in the past few years, and we need to improve the way we store, analyze and visualize it in order to draw relevant conclusions that could improve people’s quality of life. Extracting patterns and predicting future mutations and their impact will rely heavily on the efficient use of Big Data. Often a mutation on its own cannot provide enough information about a disorder or disease. Only if we combine the genetic information with the organism’s environment can we draw conclusions about the penetrance and expressivity of the mutation. Because many genes can cause a single disease and, at the same time, a single gene can cause multiple diseases, we need to analyze the whole context of a person.
In this work, a distributed solution that provides demographics and metrics about diagnostics and mutations is proposed. Seeing the occurrence of a mutation in a particular geographic region can help medical specialists narrow down the search for a patient’s mutations without sequencing the whole genome.
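As a rough illustration of the kind of regional metric described above, the following single-machine sketch counts how often each mutation appears per region and ranks candidates for a given region. It is not the distributed system proposed in this work: the record layout, the region and mutation identifiers, and the function names `mutation_frequencies` and `most_common_mutations` are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical records of (region, mutation identifier); values are illustrative only.
records = [
    ("region_a", "mutation_1"),
    ("region_a", "mutation_1"),
    ("region_a", "mutation_2"),
    ("region_b", "mutation_2"),
]


def mutation_frequencies(records):
    """Return, for each region, the share of its records carrying each mutation."""
    counts = defaultdict(Counter)
    for region, mutation in records:
        counts[region][mutation] += 1
    return {
        region: {m: n / sum(c.values()) for m, n in c.items()}
        for region, c in counts.items()
    }


def most_common_mutations(records, region, top=3):
    """List the mutations seen most often in a region, most frequent first."""
    freqs = mutation_frequencies(records).get(region, {})
    # Sort by descending frequency, breaking ties by identifier for stable output.
    return sorted(freqs, key=lambda m: (-freqs[m], m))[:top]
```

A specialist-facing tool could use a ranking such as `most_common_mutations(records, "region_a")` to decide which mutations to test for first; in a distributed setting the same per-region counting would presumably run as a grouped aggregation across nodes.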
This is an abstract which has been accepted for the NETTAB 2017 Workshop.