Assessing alignment-based taxonomic classification of ancient microbial DNA
- Subject Areas
- Bioinformatics, Evolutionary Studies, Microbiology
- Keywords
- Microbiome, Paleomicrobiology, Ancient DNA, Bioinformatics, Alignment, Taxonomic classification, Shotgun metagenomics, Microbiology
- Copyright
- © 2018 Eisenhofer et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2018. Assessing alignment-based taxonomic classification of ancient microbial DNA. PeerJ Preprints 6:e27166v1 https://doi.org/10.7287/peerj.preprints.27166v1
Abstract
The field of paleomicrobiology, the study of ancient microorganisms, is rapidly growing due to recent methodological and technological advancements. It is now possible to obtain vast quantities of DNA data from ancient specimens in a high-throughput manner and use this information to investigate the dynamics and evolution of past microbial communities. However, we still know very little about how the characteristics of ancient DNA influence our ability to accurately assign microbial taxonomies (i.e. identify species) within ancient metagenomic samples. Here, we use both simulated and published metagenomic data sets to investigate how ancient DNA characteristics affect alignment-based taxonomic classification. We find that nucleotide-to-nucleotide, rather than nucleotide-to-protein, alignments are preferable when assigning taxonomies to DNA fragments of the short lengths routinely recovered from ancient specimens (<60 bp). We determine that deamination (a form of ancient DNA damage) and random sequence substitutions corresponding to ~100,000 years of genomic divergence have minimal impact on alignment-based classification. We also test four different reference databases and find that database choice can significantly bias the results of alignment-based taxonomic classification in ancient metagenomic studies. Finally, we perform a reanalysis of previously published ancient dental calculus data, increasing the number of microbial DNA sequences assigned taxonomically by an average of 64.2-fold and identifying microbial species that were not identified in the original study. Overall, this study enhances our understanding of how ancient DNA characteristics influence alignment-based taxonomic classification of ancient microorganisms and provides recommendations for future paleomicrobiological studies.
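To make the deamination scenario above concrete, the following is a minimal illustrative sketch of how damage could be introduced into a simulated short read. It is not the authors' simulation pipeline: the abstract does not specify the damage model, so the function names, the exponential decay from the read ends, and the parameters `rate_at_end` and `decay` are all assumptions chosen for illustration. It applies C-to-T substitutions concentrated at the 5' end and G-to-A substitutions concentrated at the 3' end, the pattern commonly reported for double-stranded ancient DNA libraries.

```python
import random


def deaminate(read, rate_at_end=0.3, decay=0.5, rng=None):
    """Return a copy of `read` with simulated deamination damage.

    Hypothetical model (an assumption, not taken from the paper):
    the substitution probability equals `rate_at_end` at the terminal
    base and is multiplied by `decay` for each step away from that end.
    C->T is driven by distance from the 5' end, G->A by distance from
    the 3' end.
    """
    rng = rng or random.Random()
    bases = list(read)
    n = len(bases)
    for i, base in enumerate(bases):
        p_five_prime = rate_at_end * decay ** i
        p_three_prime = rate_at_end * decay ** (n - 1 - i)
        if base == "C" and rng.random() < p_five_prime:
            bases[i] = "T"
        elif base == "G" and rng.random() < p_three_prime:
            bases[i] = "A"
    return "".join(bases)


def mismatch_fraction(original, damaged):
    """Fraction of positions that differ between two equal-length reads."""
    if len(original) != len(damaged):
        raise ValueError("reads must have equal length")
    return sum(a != b for a, b in zip(original, damaged)) / len(original)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs give the same read
    read = "CCGATCGGATTACAGGCATCGTACGATCGG"  # made-up 30 bp fragment
    damaged = deaminate(read, rng=rng)
    print(read)
    print(damaged)
    print(mismatch_fraction(read, damaged))
```

Comparing `mismatch_fraction` before and after damage gives a rough sense of how many extra mismatches an aligner would need to tolerate on a read shorter than 60 bp; the actual effect on classification depends on the aligner and reference database used.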
Author Comment
This is a submission to PeerJ for review.
Supplemental Information
Read length distribution of simulated metagenome mimicking commonly observed fragment length distribution of ancient DNA
Genus-level taxonomic assignments of simulated metagenomes
Taxa coloured black were not used as input for constructing the simulated metagenomes and represent misclassifications.
Species-level taxonomic assignments of simulated metagenomes
Taxa coloured black were not used as input for constructing the simulated metagenomes and represent misclassifications.
Influence of heavy deamination on taxonomic assignment at species-level using empirical ancient DNA fragment length distribution metagenome
Taxa coloured black were not used as input for constructing the simulated metagenomes and represent misclassifications.
Influence of deamination on taxonomic assignment at genus-level for all read length metagenomes
Taxa coloured black were not used as input for constructing the simulated metagenomes and represent misclassifications.
Influence of deamination on taxonomic assignment at species-level for all read length metagenomes
Taxa coloured black were not used as input for constructing the simulated metagenomes and represent misclassifications.
Influence of divergence and heavy deamination on taxonomic classification at genus-level on empirical ancient DNA fragment length distribution metagenome
Taxa coloured black were not used as input for constructing the simulated metagenomes and represent misclassifications.
Influence of divergence and heavy deamination on taxonomic classification at species-level on empirical ancient DNA fragment length distribution metagenome
Taxa coloured black were not used as input for constructing the simulated metagenomes and represent misclassifications.
Read length distribution of simulated metagenome, MALTn-genome aligned reads, and unaligned reads for the 1,000ky divergence simulation
Species-level classification of the Chimpanzee sample using different reference databases
Species-level classification of the El Sidrón 1 Neanderthal using different reference databases
Species-level classification of the modern dental calculus sample using different reference databases
Species-level classification of the Spy II Neanderthal using different reference databases
Details and composition of simulated metagenome
Plaque community based on Mark Welch et al. 2016