Exploring complex disease gene relationships using simultaneous analysis

Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, VT, United States
Division of Endocrinology, Department of Medicine, University of Vermont, Burlington, VT, United States
Center for Clinical and Translational Science, University of Vermont, Burlington, VT, United States
Department of Computer Science, University of Vermont, Burlington, VT, United States
DOI
10.7287/peerj.preprints.230v1
Subject Areas
Bioinformatics, Evolutionary Studies, Genomics
Keywords
Bioinformatics, phylogenetics, simultaneous analysis, Alzheimer disease
Copyright
© 2014 Romano et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Romano JD, Tharp WG, Sarkar IN. 2014. Exploring complex disease gene relationships using simultaneous analysis. PeerJ PrePrints 2:e230v1

Abstract

The characterization of complex diseases remains a great challenge for biomedical researchers because of the myriad interactions of genetic and environmental factors. Adapting phylogenomic techniques to increasingly available genomic data provides an evolutionary perspective that may elucidate important unknown features of complex diseases. Here an automated method is presented that leverages publicly available genomic data and phylogenomic techniques. The approach is tested with nine genes implicated in the development of Alzheimer Disease, a complex neurodegenerative syndrome. The developed technique, implemented as a suite of Ruby scripts entitled “ASAP2,” first compiles a list of sequence-similarity-based orthologues using PSI-BLAST and a recursive NCBI BLAST+ search strategy. It then constructs maximum parsimony phylogenetic trees for each set of nucleotide and protein sequences, and calculates phylogenetic metrics (partitioned Bremer support values, combined branch scores, and Robinson-Foulds distance) to provide an empirical assessment of evolutionary conservation within a given genetic network. This study demonstrates the potential for automated simultaneous phylogenetic analysis to uncover previously unknown relationships among disease-associated genes that may not have been apparent using traditional, single-gene methods. Furthermore, the results provide the first integrated evolutionary history of an Alzheimer Disease gene network and identify potentially important co-evolutionary clustering around components of oxidative stress pathways.
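As an illustration of one of the metrics named above, the following is a minimal Python sketch of the Robinson-Foulds distance between two trees on the same set of taxa. It is not taken from ASAP2 (which is written in Ruby); the nested-tuple tree encoding, the function names, and the example taxa are assumptions made for this sketch. The distance is computed as the number of non-trivial bipartitions (splits) that appear in one tree but not the other.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding for this sketch: a leaf is a string,
# an internal node is a tuple of child subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf names under a (sub)tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the non-trivial bipartitions induced by internal edges.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so the same split is always encoded the same way.
    """
    all_leaves = leaves(tree)
    reference = min(all_leaves)
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if reference in clade else clade
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found


def rf_distance(tree_a: Tree, tree_b: Tree) -> int:
    """Robinson-Foulds distance: size of the symmetric difference of splits."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must share the same set of taxa")
    return len(splits(tree_a) ^ splits(tree_b))


# Illustrative trees only; these are not results from the study.
tree_1 = ((("human", "chimp"), "mouse"), ("chicken", "zebrafish"))
tree_2 = ((("human", "mouse"), "chimp"), ("chicken", "zebrafish"))
distance = rf_distance(tree_1, tree_2)
```

In this sketch a distance of 0 means the two trees imply the same set of splits, and larger values indicate greater topological disagreement, which is how a pairwise comparison of single-gene trees can be read.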

Supplemental Information

Table 1: The nine Alzheimer Disease genes used in the study.

DOI: 10.7287/peerj.preprints.230v1/supp-1

Table 2: The 34 species identified by ASAP2 using Alzheimer Disease gene queries.

DOI: 10.7287/peerj.preprints.230v1/supp-2

Table 3: Nucleotide PBS values for each internal node on nucleotide simultaneous analysis tree for each data partition.

DOI: 10.7287/peerj.preprints.230v1/supp-3

Table 4: Protein PBS values given for each internal node on protein simultaneous analysis tree for each data partition.

DOI: 10.7287/peerj.preprints.230v1/supp-4

Table 5: RF and RF′ distance between each pair of trees for nucleotide sequence data partitions.

DOI: 10.7287/peerj.preprints.230v1/supp-5

Table 6: RF and RF′ distance between each pair of trees for protein sequence data partitions.

DOI: 10.7287/peerj.preprints.230v1/supp-6

Table 7: RF and RF′ values between corresponding nucleotide and protein trees for each gene.

DOI: 10.7287/peerj.preprints.230v1/supp-7

Figure 1: Overview of ASAP2.

DOI: 10.7287/peerj.preprints.230v1/supp-8

Figure 2: Nucleotide Trees.

DOI: 10.7287/peerj.preprints.230v1/supp-9

Figure 4: Nucleotide Simultaneous Analysis Tree with Support Values.

DOI: 10.7287/peerj.preprints.230v1/supp-11

Figure 5: Protein Simultaneous Analysis Tree with Support Values.

DOI: 10.7287/peerj.preprints.230v1/supp-12

Figure 6a: Phylogenetic Network Based on Nucleotide Trees.

DOI: 10.7287/peerj.preprints.230v1/supp-13

Figure 6b: Phylogenetic Network Based on Protein Trees.

DOI: 10.7287/peerj.preprints.230v1/supp-14

All datafiles generated by ASAP2 for Alzheimer Disease.

DOI: 10.7287/peerj.preprints.230v1/supp-15