This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Advances in sequencing technology have exponentially increased data-generating capabilities, and data analysis has now become the major hurdle in many research programs. As sequencing tools become more accessible and automated, experimental design and data analysis skills become the key factors in determining the success of a study. However, proper bioinformatic analysis also relies on a deep understanding of the laboratory workflow, in order to prevent biases in the data. This is particularly true if commercial kits are used, as proprietary reagents frequently obscure the underlying reactions and their conditions. Here we present a training module that combines laboratory components (experimental evolution of T5 bacteriophage resistance by Escherichia coli, and library preparation) with bioinformatic analysis of the resulting data. Students conduct a simple genetic variant discovery experiment in the course of about a week. The module uses mature Illumina chemistry for both library preparation and sequencing, though it can be modified for use with any sequencing platform. Because most students do not use Linux, the bioinformatic pipeline is available inside a cross-platform virtual machine, which simplifies software installation and provides a non-threatening introduction to the command line. The analysis is simplified by the fact that most resistance mutations occur in one gene, which makes them easier to find. It emphasizes the potential pitfalls of using short-read data for mutational analysis, and explores biases inherent to the methodology. This module can fill an existing training gap in advanced undergraduate or early graduate education, allowing students to experience first-hand the design, execution, and analysis of next-generation sequencing experiments.
This is a preprint submission to PeerJ. It will eventually be submitted for peer review.