NOT PEER-REVIEWED
"PeerJ Preprints" is a venue for early communication or feedback before peer review. Data may be preliminary.

A peer-reviewed article of this Preprint also exists.

Supplemental Information

Compositional variance as a function of CDR/MDR length

Tetranucleotide frequency χ² values were calculated for each CDR and MDR as described in the Methods and plotted against the length of each region. HL-46 serves as a control: because less than half of its genome is represented in the reconstruction, its CDRs (blue) and MDRs (green) should be indistinguishable. A general negative relationship is observed; however, experimental CDRs (red) show a strong relationship, whereas MDRs (yellow) are poorly explained by this model, suggesting that variant sequence comprises a greater proportion of MDRs.

DOI: 10.7287/peerj.preprints.2953v1/supp-1
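
For readers unfamiliar with the statistic, the following is a minimal illustrative sketch of how a tetranucleotide frequency χ² value could be computed for a single region. It is not the authors' implementation; the exact procedure is given in the Methods, which are not reproduced here. The sketch assumes that expected counts are derived from the tetranucleotide frequencies of a reference sequence (for example, the whole genome), and the function names `tetra_counts` and `tetra_chi2` are hypothetical.

```python
from collections import Counter
from itertools import product


def tetra_counts(seq):
    """Count overlapping 4-mers, skipping any window with a non-ACGT base."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def tetra_chi2(region, reference):
    """Chi-square of a region's 4-mer counts against reference expectations.

    Assumption: expected counts are the reference 4-mer frequencies scaled
    to the number of 4-mers observed in the region.
    """
    obs = tetra_counts(region)
    ref = tetra_counts(reference)
    n_obs = sum(obs.values())
    n_ref = sum(ref.values())
    chi2 = 0.0
    for kmer in ("".join(p) for p in product("ACGT", repeat=4)):
        if ref[kmer] == 0:
            # No expectation can be formed for 4-mers absent from the reference.
            continue
        expected = n_obs * ref[kmer] / n_ref
        chi2 += (obs[kmer] - expected) ** 2 / expected
    return chi2
```

Under these assumptions, a region whose composition matches the reference gives a value near zero, and compositionally atypical regions give larger values. Plotting the value for each region against `len(region)` would give a plot of the kind described above.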

Analysis summaries

Representation of summarized data for all genomes. Data depiction is as described for Figure 2.

DOI: 10.7287/peerj.preprints.2953v1/supp-2

Additional Information

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

William C Nelson conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Jennifer M Mobberley conceived and designed the experiments, performed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Data Deposition

The following information was supplied regarding data availability:

The raw sequence data are freely available from the NCBI and IMG databases. Analysis was performed using third-party, open-source software and databases. References describing how to acquire the data and software are included in the Materials and Methods section of the manuscript.

Funding

This work was supported by the U.S. Department of Energy (DOE), Office of Biological and Environmental Research (BER), as part of BER’s Genomic Science Program (GSP). This contribution originates from the GSP Foundational Scientific Focus Area (FSFA) at the Pacific Northwest National Laboratory (PNNL). The Pacific Northwest National Laboratory is operated for DOE by Battelle Memorial Institute under contract DE-AC05-76RL01830. The sequence data presented were generated at the DOE Joint Genome Institute under contract no. DE-AC02-05CH11231 and Community Science Project 701. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
