Identifying Genetic Interactions Associated with Late-Onset Alzheimer’s Disease
- Subject Areas
- Bioinformatics, Computational Biology, Genomics, Cognitive Disorders, Neurology
- Keywords
- Bayesian networks, Alzheimer’s disease, genome-wide association study, epistasis, genetic interaction
- Copyright
- © 2013 Floudas et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
- Cite this article
- 2013. Identifying Genetic Interactions Associated with Late-Onset Alzheimer’s Disease. PeerJ PrePrints 1:e123v2 https://doi.org/10.7287/peerj.preprints.123v2
Abstract
Background: Identifying genetic interactions in data obtained from genome-wide association studies (GWASs) can help in understanding the genetic basis of complex diseases. The large number of single nucleotide polymorphisms (SNPs) in GWASs, however, makes the identification of genetic interactions computationally challenging. We developed the Bayesian Combinatorial Method (BCM), which identifies pairs of SNPs that in combination have a high statistical association with disease.
Results: We applied BCM to two late-onset Alzheimer’s disease (LOAD) GWAS datasets to identify SNP-SNP interactions between a set of known SNP associations and the dataset SNPs. For evaluation, we compared our results with those from logistic regression, as implemented in PLINK. Gene Ontology analysis of genes from the top 200 dataset SNPs for both GWAS datasets showed overrepresentation of LOAD-related terms. Four genes were common to both datasets: APOE and APOC1, which have well-established associations with LOAD, and CAMK1D and FBXL13, which have not previously been reported as associated with LOAD but for which there is evidence of involvement in LOAD. Supporting evidence was also found for additional genes from the top 30 dataset SNPs.
Conclusion: BCM performed well in identifying several SNPs with evidence of involvement in the pathogenesis of LOAD that would not have been identified by univariate analysis because of their small main effects. These results support applying BCM to identify potential genetic variants, such as SNPs, from high-dimensional GWAS datasets.
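To make the idea of a SNP pair that is jointly informative despite small main effects concrete, the following is a minimal sketch rather than the authors' BCM implementation. The abstract does not specify how BCM scores a model, so the sketch assumes a standard Bayesian network marginal likelihood (the K2 metric) for a binary disease node whose parents are the SNPs under test. The function name `k2_log_score`, the genotype coding, and the toy data are all hypothetical.

```python
from math import lgamma


def k2_log_score(parent_columns, disease):
    """Log K2 marginal likelihood of a binary disease node given discrete parents.

    parent_columns: list of equal-length genotype lists (one per parent SNP);
                    an empty list scores the model with no parents.
    disease:        list of 0 (control) / 1 (case) labels.
    NOTE: K2 is an assumed stand-in score, not necessarily the one BCM uses.
    """
    # Count controls and cases for every observed joint genotype of the parents.
    counts = {}
    for i, status in enumerate(disease):
        config = tuple(column[i] for column in parent_columns)
        cell = counts.setdefault(config, [0, 0])
        cell[status] += 1

    # K2 term per parent configuration with r = 2 disease states:
    # (r - 1)! / (N + r - 1)! * N_control! * N_case!
    score = 0.0
    for n_control, n_case in counts.values():
        score += lgamma(2) - lgamma(n_control + n_case + 2)
        score += lgamma(n_control + 1) + lgamma(n_case + 1)
    return score


# Hypothetical toy data: disease status depends on the two SNPs only jointly
# (an XOR-like pattern), so neither SNP is informative on its own.
snp_a = [0, 0, 1, 1, 0, 0, 1, 1]
snp_b = [0, 1, 0, 1, 0, 1, 0, 1]
disease = [0, 1, 1, 0, 0, 1, 1, 0]

null_score = k2_log_score([], disease)
single_a_score = k2_log_score([snp_a], disease)
single_b_score = k2_log_score([snp_b], disease)
pair_score = k2_log_score([snp_a, snp_b], disease)

print("no parents:", null_score)
print("SNP A only:", single_a_score)
print("SNP B only:", single_b_score)
print("SNP A + B :", pair_score)
```

On this toy data the pair model scores higher than either single-SNP model and higher than the model with no parents, while each single-SNP model scores below the no-parent model. That is the kind of signal a univariate analysis would miss. A real analysis would also need genotype quality control and a way to search or restrict the space of candidate pairs, which this sketch leaves out.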
Author Comment
This is version 2. Changes to the text include: in the Abstract, a clarification about the method being used for comparison (lines 32-33, line 252, "we compared our results with those from logistic regression, as implemented in PLINK"), and a more detailed explanation of how the two APOE SNPs determine the APOE protein polymorphism (lines 143-147).
Supplemental Information
Top scoring 200 SNP-BN models in the ADRC dataset. (MA: minor allele, Chr: chromosome number)
Top scoring 200 SNP-BN models in the TGen dataset. (MA: minor allele, Chr: chromosome number)
Significantly overrepresented GO terms related to the genes in the top 200 dataset SNPs in the ADRC dataset.
MF: GO molecular function; CC: GO cellular component; BP: GO biological process; No: number of genes from the list that have the relevant annotation. Italicized entries represent the redundant terms; they are placed under their most informative common ancestor (in normal font).
Significantly overrepresented GO terms related to the genes in the top 200 dataset SNPs in the TGen dataset.
MF: GO molecular function; CC: GO cellular component; BP: GO biological process; No: number of genes from the list that have the relevant annotation. Italicized entries represent the redundant terms; they are placed under their most informative common ancestor (in normal font).