Local ancestry prediction with PyLAE

Bioinformatics and Genomics

Main article text

 

Introduction

Materials and Methods

Data source

Data pre-processing

Admixture

Phasing

Two modes

Local ancestry Bayesian approach (PyLAE)


Application of local ancestry

  • (1) Find the number of (non)synonymous SNPs in groups A and B.

  • (a) Let I be the total number of studied pathways and, for i = 1, …, I, let $n_A(i)$ and $n_B(i)$ be the numbers of (non)synonymous SNPs in the ith pathway found in groups A and B, respectively. The expected fraction of (non)synonymous SNPs in individuals from group A is $p = \frac{n_A}{n_B + n_A}$, where $n_A$ is the number of (non)synonymous SNPs in all KEGG pathways found in group A and $n_B$ is the number of (non)synonymous SNPs in all KEGG pathways found in group B. The fraction $p_i$ of group A (non)synonymous SNPs in the ith KEGG pathway is $p_i = \frac{n_A(i)}{n_B(i) + n_A(i)}$.

  • (2) The enrichment D(N)SE scores are computed for every pathway with continuity correction:

  • (a) $\mathrm{D(N/S)SE\ Score} = \dfrac{(p - p_i) \pm \frac{1}{2\,(n_B(i) + n_A(i))}}{\sqrt{\frac{p\,(1 - p)}{n_B(i) + n_A(i)}}}$

  • (3) P-values are calculated from the scores and adjusted for multiple testing using the Bonferroni and Benjamini–Hochberg corrections; the adjusted values are used to identify differentially enriched pathways. A pathway is considered differentially enriched if its adjusted P-value is < 0.005 (Benjamin et al., 2018).

  • (4) To consider the excess of synonymous SNPs over nonsynonymous SNPs, we calculate enrichment scores for synonymous SNPs, DSSE. To be considered significant, the P-value of the nonsynonymous test is required to be below the corresponding P-value of the synonymous test for each pathway.
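The scoring steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PyLAE implementation: the function names (`dse_scores`, `bonferroni`, `benjamini_hochberg`) are hypothetical, the P-value is assumed to be a two-sided normal tail probability of the score, and the ± continuity correction is assumed to shrink the difference p − p_i toward zero.

```python
import math


def dse_scores(n_a, n_b):
    """Return a (score, p_value) pair per pathway.

    n_a[i], n_b[i]: counts of (non)synonymous SNPs in pathway i
    for groups A and B (hypothetical input layout).
    """
    total_a, total_b = sum(n_a), sum(n_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("both groups need at least one SNP")
    p = total_a / (total_a + total_b)  # expected fraction for group A
    results = []
    for a, b in zip(n_a, n_b):
        n = a + b
        if n == 0:  # empty pathway: no evidence either way
            results.append((0.0, 1.0))
            continue
        diff = p - a / n  # p - p_i
        # Assumption: the correction shrinks |diff| toward zero, never past it.
        shrunk = max(abs(diff) - 1.0 / (2 * n), 0.0)
        score = math.copysign(shrunk, diff) / math.sqrt(p * (1 - p) / n)
        # Assumption: two-sided P-value from the standard normal distribution.
        p_value = math.erfc(abs(score) / math.sqrt(2))
        results.append((score, p_value))
    return results


def bonferroni(p_values):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, pv * m) for pv in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg step-up adjustment, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy counts for two pathways (illustrative numbers, not data from the paper).
scores = dse_scores([10, 50], [10, 10])
adjusted = benjamini_hochberg([pv for _, pv in scores])
enriched = [q < 0.005 for q in adjusted]
```

For step (4), the same functions would be run once on nonsynonymous counts and once on synonymous counts, keeping a pathway only when its nonsynonymous P-value is below its synonymous P-value.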

Using PyLAE with different genomes and/or sets of markers

Results

Investigation of the reference dataset

Admixture profiles in Diploid vs. Haploid modes

Population-specific accuracy limitations of the admixture-based approach

Performance of the PyLAE algorithm

Application of local ancestry

Availability and requirements

Additional Information and Declarations

Competing Interests

Nikita Moshkov is a part-time bioinformatician for Atlas Biomed.

Tatiana Tatarinova is an academic editor for PeerJ.

Author Contributions

Nikita Moshkov analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.

Aleksandr Smetanin performed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.

Tatiana V. Tatarinova conceived and designed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.

Data Availability

The following information was supplied regarding data availability:

The data is available at GitHub: https://github.com/smetam/pylae

Funding

The authors received no funding for this work.
