
Genome-wide discovery of local RNA structural elements in Zika virus

In addition to encoding RNA primary structures, genomes also encode RNA secondary and tertiary structures that play roles in gene regulation and, in the case of RNA viruses, ge…
NOT PEER-REVIEWED
"PeerJ Preprints" is a venue for early communication or feedback before peer review. Data may be preliminary.

A peer-reviewed article of this Preprint also exists.


Supplemental Information

Figure S1. Comparative arc diagram depicting the previously described 5' RNA structural motifs vs. the ScanFold predicted bps

(a) Arc diagram of 5' end region as predicted via ScanFold; base pairs are colored based on their z-score cutoff where blue lines depict bps which were predicted in the z-score < -2 results (Table S7) and green lines refer to bps which were predicted in the z-score < -1 results (Table S6). (b) Arc diagram of the accepted secondary structure model for the 5' end of ZIKV as shown in (Ye et al. 2016) and mapped to the KJ776791.2 sequence.

DOI: 10.7287/peerj.preprints.27101v1/supp-1

Figure S2. Comparative arc diagram depicting the known RNA structural motifs vs. the ScanFold predicted bps

(a) Arc diagram of the 3' end region as predicted via ScanFold; base pairs are colored based on their z-score cutoff where blue lines depict bps which were predicted in the z-score < -2 results (Table S7), green lines refer to bps which were predicted in the z-score < -1 results (Table S6), and yellow lines were predicted in the no filter results (Table S5). (b) Arc diagram of the accepted secondary structure model for the 3' end of ZIKV as shown in (Goertz et al. 2017) mapped to the KJ776791.2 sequence. The start codon nucleotide locations have been highlighted with a light blue bar.

DOI: 10.7287/peerj.preprints.27101v1/supp-2

Figure S3. Secondary structure model depicting the ScanFold proposed structures within and directly adjacent to known 5' and 3' structured regions

Base pairs are colored based on their z-score cutoff: blue lines depict bps which were predicted in the z-score < -2 results (Table S7), green lines refer to bps which were predicted in the z-score < -1 results (Table S6), and yellow lines were predicted in the no filter results (Table S5). The start and stop codon nucleotides have been circled and labeled in blue and green respectively. Nucleotides which established ScanFold bp preserving mutations within the alignment are highlighted with filled green circles.

DOI: 10.7287/peerj.preprints.27101v1/supp-3

Table S1. Results of the scanning window analysis of the ZIKV genome (NCBI accession KJ776791.2) as output from the ScanFold-Scan program

Each row contains the data calculated for one window. Columns A and B are the starting (i) and ending (j) coordinates of the window fragment. Column C is the temperature used for all RNAfold calculations. Columns D-H report the ∆Gnative, thermodynamic z-score, stability ratio p-value, ensemble diversity, and frequency-of-MFE (fMFE) values, respectively (detailed descriptions of all metrics can be found at the RNAStructuromeDB, https://structurome.bb.iastate.edu, or in the corresponding manuscript (Andrews et al. 2017)). Column I contains the sequence of the window; the ∆Gnative and centroid structures of this sequence are shown in Columns J and K. Columns L-O report nucleotide counts for the window sequence.

DOI: 10.7287/peerj.preprints.27101v1/supp-4
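For readers unfamiliar with the thermodynamic z-score reported in Table S1, the following is a minimal sketch of the conventional definition: the native folding energy is compared with the energies of randomized versions of the same window. The function name, the use of the population standard deviation, and the example energies are assumptions made for illustration; they are not taken from the ScanFold-Scan source, which should be consulted for the exact procedure.

```python
from statistics import mean, pstdev


def thermodynamic_z_score(native_dg, shuffled_dgs):
    """Return (native - mean(shuffled)) / std(shuffled).

    native_dg    : folding free energy of the native window (kcal/mol)
    shuffled_dgs : folding free energies of randomized copies of the window
    The population standard deviation is an assumption of this sketch.
    """
    mu = mean(shuffled_dgs)
    sigma = pstdev(shuffled_dgs)
    if sigma == 0:
        raise ValueError("shuffled energies have zero variance")
    return (native_dg - mu) / sigma


# Hypothetical energies, not values from Table S1.
z = thermodynamic_z_score(-10.0, [-6.0, -8.0, -4.0, -6.0])
print(round(z, 3))
```

A negative value indicates that the native sequence folds more stably than its randomized counterparts, which is why the figures and tables here filter on z-score < -1 and z-score < -2.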

Table S2. ScanFold log file produced during the ScanFold-Fold portion of the program

The log file is separated into two portions. The first half (rows 1 to 87,448) contains a table for each nucleotide in the sequence. These tables contain the cumulative base pairing information for that nucleotide as predicted throughout the scan. Column A refers to the i-nucleotide of the sequence. Column B refers to the coordinate of the j base pair. The total number of windows in which the i-j pair appears, as well as the total number of windows in which the i-nucleotide appears, are reported in column D. The average window minimum free energy, z-score, and ensemble diversity of each i-j pair are reported in columns E-G, respectively. Column H reports the sum of z-scores for each i-j pair, which is used to calculate the coverage-normalized z-score (the sum of z-scores divided by the total number of windows in which the i-nucleotide appeared) reported in Column I. Column J reports a summary of the bps predicted for each i-nucleotide. The second half of the log file, starting at row 87,449, is a list of the most favorable i-j pairs (columns B and C) associated with the i-nucleotide listed in column A. In places where this nucleotide competed with other i-nucleotides for the same j-nucleotide, the "winning" i-j pair is reported and denoted with an asterisk (in some cases the winning i-j pair does not contain the original i-nucleotide, or the nucleotide may be unpaired). Columns D, E, and F contain the average window minimum free energy, z-score, and ensemble diversity for the corresponding i-j pair.

DOI: 10.7287/peerj.preprints.27101v1/supp-5
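The coverage-normalized z-score described for Table S2 can be sketched as follows: z-scores are summed per i-j pair and divided by the number of windows in which the i-nucleotide appeared. The function names and input layout are hypothetical, and picking the partner with the lowest normalized score is an assumption about what "most favorable" means; the competition between i-nucleotides for the same j-nucleotide is not modeled here.

```python
from collections import defaultdict


def coverage_normalized_z(window_pairs, window_coverage):
    """Sum z-scores per (i, j) pair and divide by the coverage of nucleotide i.

    window_pairs    : iterable of (i, j, z), one entry per window in which
                      the i-j pair was predicted
    window_coverage : dict mapping i to the total number of windows that
                      contained nucleotide i
    """
    z_sum = defaultdict(float)
    for i, j, z in window_pairs:
        z_sum[(i, j)] += z
    return {(i, j): total / window_coverage[i] for (i, j), total in z_sum.items()}


def most_favorable_partner(normalized):
    """For each i, keep the partner j with the lowest normalized z-score (assumed criterion)."""
    best = {}
    for (i, j), score in normalized.items():
        if i not in best or score < best[i][1]:
            best[i] = (j, score)
    return best


# Hypothetical data: nucleotide 1 was covered by 4 windows in total.
pairs = [(1, 20, -2.0), (1, 20, -1.0), (1, 18, -0.5)]
normalized = coverage_normalized_z(pairs, {1: 4})
best = most_favorable_partner(normalized)
print(normalized, best)
```

Dividing by the coverage of the i-nucleotide rather than by the number of windows containing the pair penalizes pairs that are predicted in only a few of the windows that could have contained them.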

Table S3. Results of 37 ZIKV genomes curated in the ZikaVR database (Gupta et al. 2016) aligned to KJ776791.2

Genomes were aligned using the MAFFT web server (Katoh et al. 2017; Kuraku et al. 2013) with default settings. The heading for each result contains the NCBI accession number and name of the aligned sequence.

DOI: 10.7287/peerj.preprints.27101v1/supp-6

Table S4. Base pair counts tabulating the number and type of base pairs that appear in the ScanFold < -1 predicted structure when compared to 37 aligned ZIKV genomes

A total of 37 ZIKV genomes were aligned to KJ776791.2 using the MAFFT web server (Katoh et al. 2017; Kuraku et al. 2013) with default settings. Aligned sequences (Table S3) were compared to ScanFold-Fold predicted bps (with z-score < -1) to tabulate the types of base pairs found throughout the alignment. Column S reports the percent of sequences in the alignment in which that base pair is canonical, and column T reports the number of different canonical base pair types. Results for the previously reported 5' and 3' UTR structural regions appear as separate worksheets.

DOI: 10.7287/peerj.preprints.27101v1/supp-7
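The tabulation summarized in Table S4 can be illustrated with a short sketch that, for one predicted i-j pair, counts the nucleotide combinations found at those two alignment columns. The function name, the toy sequences, and the choice to treat G-U wobble pairs as canonical are assumptions for illustration; gaps or ambiguous characters simply fall into the non-canonical counts in this sketch.

```python
# Watson-Crick pairs plus G-U wobble; counting wobble as canonical is an assumption.
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}


def tabulate_pair(aligned_seqs, i, j):
    """Count the pair types seen at 1-based alignment columns i and j.

    Returns (counts, percent_canonical, n_canonical_types).
    """
    counts = {}
    for seq in aligned_seqs:
        pair = (seq[i - 1] + seq[j - 1]).upper().replace("T", "U")
        counts[pair] = counts.get(pair, 0) + 1
    total = sum(counts.values())
    canonical = {p: n for p, n in counts.items() if p in CANONICAL}
    percent = 100.0 * sum(canonical.values()) / total if total else 0.0
    return counts, percent, len(canonical)


# Toy alignment, not taken from Table S3.
toy_alignment = ["GAAAC", "AAAAU", "GAAAU", "GAAAA"]
counts, percent, n_types = tabulate_pair(toy_alignment, 1, 5)
print(counts, percent, n_types)
```

A pair that remains canonical across the alignment while changing type (for example G-C in one genome and A-U in another) is the kind of structure-preserving variation that such a table is meant to surface.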

Table S5. CT file of the default no-filter results output from ScanFold-Fold

DOI: 10.7287/peerj.preprints.27101v1/supp-8

Table S6. CT file of the default z-score < -1 results output from ScanFold-Fold

DOI: 10.7287/peerj.preprints.27101v1/supp-9

Table S7. CT file of the default z-score < -2 results output from ScanFold-Fold

DOI: 10.7287/peerj.preprints.27101v1/supp-10
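Tables S5-S7 are distributed as CT (connectivity table) files. As a minimal sketch, assuming the common CT layout (a header line beginning with the sequence length, then one line per nucleotide with the index in the first field and the pairing partner, or 0 if unpaired, in the fifth), the base pairs can be read as below. The function name and the toy file contents are hypothetical.

```python
def parse_ct(text):
    """Return (length, set of (i, j) pairs with i < j) from CT-formatted text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n = int(lines[0].split()[0])          # header: sequence length, then a title
    pairs = set()
    for ln in lines[1:n + 1]:
        fields = ln.split()
        i, j = int(fields[0]), int(fields[4])
        if j > i:                          # skips unpaired (0) and the mirrored entry
            pairs.add((i, j))
    return n, pairs


# Toy CT content, not taken from the supplemental files.
toy_ct = """4 example
1 G 0 2 4 1
2 A 1 3 0 2
3 A 2 4 0 3
4 C 3 0 1 4
"""
length, ct_pairs = parse_ct(toy_ct)
print(length, sorted(ct_pairs))
```

With the pairs from two files held as sets, an intersection such as `pairs_a & pairs_b` gives the base pairs shared between two z-score cutoffs or between a prediction and a reference structure.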

Table S8. RNAstructure webserver scorer results of the ScanFold predicted structures compared to accepted structures

The structures and sequences were uploaded to the server as shown, and scorer was run with default settings.

DOI: 10.7287/peerj.preprints.27101v1/supp-11

Additional Information

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Ryan J Andrews conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.

Julien Roche conceived and designed the experiments, authored or reviewed drafts of the paper, approved the final draft.

Walter N Moss conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.

Data Deposition

The following information was supplied regarding data availability:

GitHub - https://github.com/moss-lab/ScanFold

RNAStructuromeDB - https://structurome.bb.iastate.edu/

Funding

This work was supported by startup funds from the Iowa State University College of Agriculture and Life Sciences and the Roy J. Carver Charitable Trust, as well as grant R00GM112877 from the NIH/NIGMS. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

