NOT PEER-REVIEWED
"PeerJ Preprints" is a venue for early communication or feedback before peer review. Data may be preliminary.

A peer-reviewed article of this Preprint also exists.

Supplemental Information

Analysis by groups of the CRISPR-like loci inserted in the VlpC gene

Alignment performed with MUSCLE software, using strain NC_000915 as the reference genome (first line). DR: direct repeat sequence. Pairwise identity: 92% in the gene, 86% in the CRISPR-like loci.

DOI: 10.7287/peerj.preprints.27196v1/supp-1
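
The pairwise identity values quoted in these captions can be illustrated with a minimal sketch. The captions do not state how the figure was computed, so the definition below is an assumption: matches are pooled over all sequence pairs and alignment columns, a gap against a base counts as a mismatch, and gap-against-gap columns are ignored. The function name and the toy alignment are hypothetical and are not data from the preprint.

```python
from itertools import combinations


def pairwise_percent_identity(aligned):
    """Percent identity pooled over every pair of aligned sequences.

    Assumed definition (not taken from the preprint): a base aligned to a
    gap is a mismatch, and positions where both sequences have a gap are
    skipped.
    """
    if len(aligned) < 2:
        raise ValueError("need at least two aligned sequences")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")

    matches = 0
    compared = 0
    for first, second in combinations(aligned, 2):
        for x, y in zip(first.upper(), second.upper()):
            if x == "-" and y == "-":
                continue  # gap against gap carries no information
            compared += 1
            if x == y:
                matches += 1
    if compared == 0:
        raise ValueError("alignment contains only gaps")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    toy = ["ACGT-A", "ACGTTA", "AC-T-A"]
    print(round(pairwise_percent_identity(toy), 1))
```

Other tools may average per-pair identities instead of pooling, or treat gaps differently, so values from this sketch need not reproduce the percentages reported in the captions.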

Analysis by groups of the CRISPR-like loci inserted in the VlpC gene

Alignment performed with MUSCLE software, using strain J99 as the reference genome (first line). DR: direct repeat sequence. A continuous line indicates the presence of gaps. Pairwise identity: 92% in the gene, 79% in the CRISPR-like loci. Two large deletions are observed that affect spacer 1 and DR 2. Spacer 2 is completely deleted.

DOI: 10.7287/peerj.preprints.27196v1/supp-2

Analysis by groups of the CRISPR-like loci inserted in the VlpC gene

Alignment performed with MUSCLE software, using strain J99 as the reference genome (first line). DR: direct repeat sequence. A continuous line indicates the presence of gaps. Pairwise identity: 91% in the gene, 77% in the CRISPR-like loci. A duplication of DR 1 and of spacer 1 is observed, which appears as a deletion in the sequence used as the reference.

DOI: 10.7287/peerj.preprints.27196v1/supp-3

Analysis by groups of the CRISPR-like loci inserted in the VlpC gene

Alignment performed with MUSCLE software, using strain J99 as the reference genome (first line). DR: direct repeat sequence. A continuous line indicates the presence of gaps. Pairwise identity: 90% in the gene, 85% in the CRISPR-like loci. Spacers 1 and 2 show deletions at several different points.

DOI: 10.7287/peerj.preprints.27196v1/supp-4

Analysis by groups of the CRISPR-like loci inserted in the VlpC gene

Alignment performed with MUSCLE software, using strain J99 as the reference genome (first line). DR: direct repeat sequence. A continuous line indicates the presence of gaps. Pairwise identity: 90% in the gene, 54% in the CRISPR-like loci. Two duplications of DR 2 and of spacer 2 are observed in the Puno135 strain.

DOI: 10.7287/peerj.preprints.27196v1/supp-5

Analysis by groups of the CRISPR-like loci inserted in the VlpC gene

Alignment performed with MUSCLE software, using strain J99 as the reference genome (first line). DR: direct repeat sequence. A continuous line indicates the presence of gaps. Pairwise identity: 94% in the gene, 85% in the CRISPR-like loci. Deletions are observed at several points along the CRISPR-like loci.

DOI: 10.7287/peerj.preprints.27196v1/supp-6

Alignment of the sequences obtained from blastn for the CRISPR-like loci detected in Shi417 and Shi112 strains

The color scale in the alignment indicates the degree of variation in both the gene (hypothetical protein) and its CRISPR-like loci: darker shading corresponds to higher pairwise identity and lighter shading to lower pairwise identity. Alignment performed with MUSCLE software, using strains Shi417 and Shi112 as reference genomes (first and second lines). DR: direct repeat sequence. A continuous line indicates the presence of gaps. Pairwise identity: 72% in the gene, 56% in the CRISPR-like loci. Strains Aklavik86, Aklavik117 and P12 showed a truncated 5' region.

DOI: 10.7287/peerj.preprints.27196v1/supp-7

Phylogenetic tree constructed with the CRISPR-like loci detected in the gene coding for a hypothetical protein

A phylogeographic differentiation of the CRISPR-like loci is observed. Analysis performed with the MEGA7 software. The scale bar represents an evolutionary distance of 0.02 under the Jukes-Cantor model. (A) Group of African and European geographic origin. (B) Amerind geographic group.

DOI: 10.7287/peerj.preprints.27196v1/supp-8
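
For readers unfamiliar with the Jukes-Cantor model named in the tree captions, the standard correction converts an observed proportion of differing sites, p, into an estimated number of substitutions per site, d = -(3/4) ln(1 - (4/3) p). The sketch below only illustrates that formula; the function name is hypothetical, and it is not a reproduction of the MEGA7 analysis or of any setting used by the authors.

```python
import math


def jukes_cantor_distance(p):
    """Jukes-Cantor distance for an observed proportion of differing sites.

    p is the fraction of aligned nucleotide positions that differ between
    two sequences. The correction is undefined for p >= 0.75.
    """
    if p < 0.0 or p >= 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    if p == 0.0:
        return 0.0  # identical sequences have zero distance
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


if __name__ == "__main__":
    # For small p the corrected distance stays close to p itself.
    for p in (0.01, 0.02, 0.10, 0.30):
        print(p, jukes_cantor_distance(p))
```

The corrected distance is always at least as large as p, because the model accounts for multiple substitutions at the same site.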

Analysis of strains Shi470 and BM012A

(A) The alignment with Mauve revealed that the Poly-E rich protein gene lies in a region near the breakpoint of an inversion that affects these strains. (B) Alignment performed with MUSCLE software. DR: direct repeat sequence. A continuous line indicates the presence of gaps. The differences observed in the alignment can be explained by the number of DR sequences and spacers in which the strains differ.

DOI: 10.7287/peerj.preprints.27196v1/supp-9

Alignment and blastn for the CRISPR-like loci detected in Shi470 and BM012A strains

The color scale in the alignment indicates the degree of variation in both the Poly-E rich protein gene and its CRISPR-like loci: darker shading corresponds to higher pairwise identity and lighter shading to lower pairwise identity. Alignment performed with MUSCLE software, using strain Shi470 as the reference genome (first line). DR: direct repeat sequence. A continuous line indicates the presence of gaps. Pairwise identity: 80% in the gene, 60% in the CRISPR-like loci. The alignment revealed an average location of the CRISPR-like loci, which show high variability.

DOI: 10.7287/peerj.preprints.27196v1/supp-10

Phylogenetic tree constructed with the CRISPR-like loci detected in the Poly-E rich protein gene

Phylogenetic tree constructed with the 35 CRISPR-like loci detected in the gene coding for a Poly-E rich protein. A phylogeographic differentiation of the CRISPR-like loci is observed. Analysis performed with the MEGA7 software. The scale bar represents an evolutionary distance of 0.02 under the Jukes-Cantor model. (A) Group of African and European geographic origin. (B) Asian geographic group. (C) Amerind group.

DOI: 10.7287/peerj.preprints.27196v1/supp-11

Alignment and blastn for the CRISPR1-like loci detected in SJM180 strain

Alignment performed with MUSCLE software, using strain SJM180 as the reference genome (first line). DR: direct repeat sequence. A continuous line indicates the presence of gaps. Pairwise identity: 85% in the gene, 63% in the CRISPR-like loci. The alignment revealed an average location of the CRISPR-like loci. It also shows degeneration of the CRISPR1-like loci, while the 5' and 3' regions show a high degree of variability.

DOI: 10.7287/peerj.preprints.27196v1/supp-12

Phylogenetic tree constructed with the CRISPR1-like loci detected in the SJM180 strain

Phylogenetic tree constructed with the 39 CRISPR1-like loci detected in the SJM180 strain, in the gene coding for a hypothetical protein. No phylogeographic differentiation of the CRISPR-like loci is observed. Analysis performed with the MEGA7 software. The scale bar represents an evolutionary distance of 0.01 under the Jukes-Cantor model.

DOI: 10.7287/peerj.preprints.27196v1/supp-13

Alignment and blastn for the CRISPR3-like loci detected in SJM180 strain

Alignment performed with MUSCLE software, using strain SJM180 as the reference genome (first line). DR: direct repeat sequence. A continuous line indicates the presence of gaps. Pairwise identity: 84% in the gene, 61% in the CRISPR-like loci. The alignment revealed an average location of the CRISPR-like loci. It also shows degeneration of the CRISPR1-like loci, while the 5' and 3' regions show a high degree of variability.

DOI: 10.7287/peerj.preprints.27196v1/supp-14

Phylogenetic tree constructed with the CRISPR3-like loci detected in the SJM180 strain

Phylogenetic tree constructed with the 39 CRISPR3-like loci detected in the SJM180 strain, in the gene coding for a hypothetical protein. No phylogeographic differentiation of the CRISPR-like loci is observed. Analysis performed with the MEGA7 software. The scale bar represents an evolutionary distance of 0.01 under the Jukes-Cantor model.

DOI: 10.7287/peerj.preprints.27196v1/supp-15

Summary table of spacer and repeat sequences

Characteristics of the consensus direct repeat (DR) sequences and of the spacer sequences of the 22 CRISPR-like loci identified with CRISPRFinder.

DOI: 10.7287/peerj.preprints.27196v1/supp-16

CRISPRTarget analysis

Analysis of the spacers identified by CRISPRFinder to determine their similarity with foreign genetic elements.

DOI: 10.7287/peerj.preprints.27196v1/supp-17

BLAST of cDNA for the VlpC gene

BLAST analysis with the cDNA for the VlpC gene. The analysis showed that the gene was expressed in 50 of the 52 strains that carry it; only the South Africa20 and Shi470 strains did not express it. The e-value used was 10e-5.

DOI: 10.7287/peerj.preprints.27196v1/supp-18

Identification of cas domains

Searches were run with hmmscan using an e-value of 10e-5. E-value: domain reliability; c-Evalue: reliability for this particular domain; acc: average probability of the aligned residues, a measure of alignment reliability from 0 to 1, where 1.00 indicates that the alignment is completely reliable.

DOI: 10.7287/peerj.preprints.27196v1/supp-19

Additional Information

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Jerson Alexander Garcia-Zea conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.

Roberto de la Herrán conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.

Francisca Robles Rodríguez prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.

Rafael Navajas-Pérez prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.

Carmelo Ruiz Rejón conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.

Data Deposition

The following information was supplied regarding data availability:

The group analysis of the CRISPR-like loci inserted in the VlpC gene is shown in supplementary file one. Alignments and analyses of the CRISPR-like loci detected in genes other than VlpC are shown in supplementary files two, four, five, seven and nine. Phylogenies of the CRISPR-like loci detected in genes other than VlpC are shown in supplementary files three, six, eight and ten.

Funding

The authors received no funding for this work.

