GenHap: A novel computational method based on genetic algorithms for haplotype assembly
Abstract
Haplotyping is the process of inferring the full haplotype of a cell: each heterozygous Single Nucleotide Polymorphism (SNP) is assigned to exactly one of the two chromosomes. In this work, we propose a novel computational method for haplotype assembly based on Genetic Algorithms (GAs), named GenHap. Our approach can efficiently solve large instances of the weighted Minimum Error Correction (wMEC) problem, yielding optimal solutions by means of a global search process. wMEC consists of computing the two haplotypes that partition the sequencing reads into two unambiguous sets with the smallest number of corrections to the SNP values. Since wMEC has been proven to be NP-hard, we tackle it with GAs, a population-based optimization strategy that mimics Darwinian processes. In GAs, a population of randomly generated individuals undergoes a selection mechanism and is modified by genetic operators. Each individual takes part in a selection process driven by a quality measure (i.e., the fitness value), inspired by Darwin’s “survival of the fittest” principle.
Our preliminary experimental results show that GenHap is able to achieve correct solutions in short running times. Moreover, this approach can be used to compute haplotypes in organisms with different ploidy. The proposed evolutionary technique has the further advantage that it can be extended with a multi-objective fitness function that takes into account additional information, such as the methylation patterns of the different chromosomes or the gene proximity in maps obtained through Chromosome Conformation Capture (3C) experiments.
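To make the wMEC formulation and the GA loop described above concrete, the following is a minimal Python sketch. It is not the GenHap implementation, whose code is still under development. The encoding (one bit per read, assigning it to one of the two sets), the tournament selection, the single-point crossover, the bit-flip mutation, the elitism, and all names and parameter values (wmec_cost, evolve, pop_size, mutation_rate, the toy reads) are assumptions chosen for illustration only.

```python
import random


def wmec_cost(reads, weights, partition):
    """Return (cost, h0, h1) for one bipartition of the reads.

    reads[i][j] is 0, 1, or None (SNP j is not covered by read i);
    weights[i][j] is the confidence of that call;
    partition[i] in {0, 1} assigns read i to one of the two sets.
    """
    n_snps = len(reads[0])
    cost = 0.0
    haplotypes = ([], [])
    for side in (0, 1):
        for j in range(n_snps):
            # Total weight supporting allele 0 and allele 1 at SNP j.
            support = [0.0, 0.0]
            for read, w, p in zip(reads, weights, partition):
                if p == side and read[j] is not None:
                    support[read[j]] += w[j]
            allele = 0 if support[0] >= support[1] else 1
            haplotypes[side].append(allele)
            # Calls that disagree with the consensus would need correcting.
            cost += support[1 - allele]
    return cost, haplotypes[0], haplotypes[1]


def tournament(population, costs, rng, k=3):
    """Pick the lowest-cost individual among k random contenders."""
    contenders = rng.sample(range(len(population)), k)
    return population[min(contenders, key=lambda i: costs[i])]


def evolve(reads, weights, pop_size=40, generations=60,
           mutation_rate=0.05, seed=0):
    """Toy GA over read bipartitions; assumes at least two reads."""
    rng = random.Random(seed)
    n_reads = len(reads)
    population = [[rng.randint(0, 1) for _ in range(n_reads)]
                  for _ in range(pop_size)]
    best, best_cost = None, float("inf")
    for generation in range(generations):
        costs = [wmec_cost(reads, weights, ind)[0] for ind in population]
        champion = min(range(pop_size), key=lambda i: costs[i])
        if costs[champion] < best_cost:
            best, best_cost = population[champion][:], costs[champion]
        if generation == generations - 1:
            break  # the last population has been evaluated; stop here
        next_population = [best[:]]  # elitism: keep the best found so far
        while len(next_population) < pop_size:
            a = tournament(population, costs, rng)
            b = tournament(population, costs, rng)
            cut = rng.randint(1, n_reads - 1)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: move a read to the other set.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_population.append(child)
        population = next_population
    return best, best_cost


# Hypothetical toy instance: six reads over four SNPs, with one
# low-confidence call (read 2, SNP 2) that conflicts with its neighbours.
reads = [
    [0, 1, 0, None],
    [None, 1, 0, 1],
    [0, 1, 1, 1],
    [1, 0, 1, None],
    [None, 0, 1, 0],
    [1, 0, None, 0],
]
weights = [[1.0] * 4 for _ in reads]
weights[2][2] = 0.2

known_cost, h0, h1 = wmec_cost(reads, weights, [0, 0, 0, 1, 1, 1])
print(known_cost, h0, h1)  # 0.2 [0, 1, 0, 1] [1, 0, 1, 0]

best_partition, best_cost = evolve(reads, weights)
print(best_partition, best_cost)
```

In this sketch the fitness of an individual is simply its wMEC cost, so lower is better. A multi-objective extension of the kind mentioned above would replace that single value with several criteria, which this example does not attempt.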
Cite this as
2017. GenHap: A novel computational method based on genetic algorithms for haplotype assembly. PeerJ Preprints 5:e3246v1 https://doi.org/10.7287/peerj.preprints.3246v1
Author comment
This is an abstract which has been accepted for the NETTAB 2017 Workshop.
Additional Information
Competing Interests
The authors declare that they have no competing interests.
Author Contributions
Andrea Tangherloni conceived and designed the experiments, performed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, performed the computation work, reviewed drafts of the paper.
Simone Spolaor conceived and designed the experiments, analyzed the data, wrote the paper, performed the computation work, reviewed drafts of the paper.
Leonardo Rundo analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.
Marco S Nobile reviewed drafts of the paper.
Ivan Merelli conceived and designed the experiments, contributed reagents/materials/analysis tools, reviewed drafts of the paper.
Paolo Cazzaniga reviewed drafts of the paper.
Daniela Besozzi reviewed drafts of the paper.
Giancarlo Mauri reviewed drafts of the paper.
Pietro Liò reviewed drafts of the paper.
Data Deposition
The following information was supplied regarding data availability:
The code of this work is still under development.
Funding
The authors received no funding for this work.