Genes of the Pig, Sus scrofa, reconstructed with EvidentialGene
- Published
- Accepted
- Subject Areas
- Agricultural Science, Bioinformatics, Genomics, Data Science
- Keywords
- precision genomics, transcriptome assembly, model organism, biomedical genomics, agricultural genomics, genome informatics pipeline
- Copyright
- © 2018 Gilbert
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2018. Genes of the Pig, Sus scrofa, reconstructed with EvidentialGene. PeerJ Preprints 6:e27191v1 https://doi.org/10.7287/peerj.preprints.27191v1
Abstract
The pig is a well-studied model animal of biomedical and agricultural importance. Genes of this species, Sus scrofa, are known from experiments and predictions, and are collected in the NCBI Reference Sequence (RefSeq) database. Gene reconstruction from transcribed gene evidence, RNA-seq, can now accurately and completely reproduce the biological gene sets of animals and plants. Such a gene set for the pig is reported here; it includes human orthologs missing from RefSeq and other improvements to the current NCBI pig gene set. It was built with a methodology for accurate and complete gene set reconstruction from RNA: the automated SRA2Genes pipeline of the EvidentialGene project.
Author Comment
This is a submission to PeerJ for review.
Supplemental Information
pig18evg_datadesc.pages Conserved vertebrate genes recovered in Pig Evigene vs NCBI gene sets, as computed with vertebrate conserved genes of OrthoDB
Columns include the gene IDs BUSCO_ID, Evigene_ID, and NCBI RefSeq ID. Other columns: Cmp, the qualitative comparison (evgain, same, evloss) of alignment difference; Diff, the numeric difference in alignment score to the conserved protein; dEvg-Ncb, the two alignment scores; BC, the BUSCO complete/fragment/missing quality score; and Product_Name, the vertebrate protein product.
pig18evg_datadesc.pages Human genes recovered in Pig Evigene vs NCBI gene sets, as computed with human and pig RefSeq and Evigene proteins and NCBI BLASTP
Columns include the gene IDs Human RefSeq ID, Evigene_pig_ID, and NCBI_pig_ID; AAsize, the human protein size; EvAlign and NcAlign, the alignment scores to Evigene and NCBI proteins; DiffA, the difference in alignments; and Human_Gene_Name.
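As a reading aid for the human-gene comparison table, the following minimal sketch tallies how often the Evigene or the NCBI pig protein aligns better to its human gene. It rests on several assumptions that the description above does not confirm: that the table is tab-separated with the header names shown, that DiffA is EvAlign minus NcAlign, and that the evgain/same/evloss labels of the conserved-gene table can be reused here. The function names, the `min_diff` threshold, and all sample rows are hypothetical placeholders, not data from the article.

```python
import csv
import io
from collections import Counter

# Hypothetical rows: IDs, sizes, and scores are placeholders, not results from the article.
# The tab-separated layout and exact header spellings are assumptions.
SAMPLE = (
    "Human_RefSeq_ID\tEvigene_pig_ID\tNCBI_pig_ID\tAAsize\tEvAlign\tNcAlign\tHuman_Gene_Name\n"
    "human_0001\tevg_0001\tncbi_0001\t500\t480\t450\texample gene A\n"
    "human_0002\tevg_0002\tncbi_0002\t300\t250\t250\texample gene B\n"
    "human_0003\tevg_0003\tncbi_0003\t800\t600\t790\texample gene C\n"
)


def classify(ev_align, nc_align, min_diff=1):
    """Label one gene by which pig protein aligns better to the human protein.

    Assumes DiffA = EvAlign - NcAlign; min_diff is a hypothetical threshold.
    """
    diff = ev_align - nc_align
    if diff >= min_diff:
        return "evgain"   # Evigene protein aligns better
    if diff <= -min_diff:
        return "evloss"   # NCBI protein aligns better
    return "same"


def tally(handle):
    """Count evgain/same/evloss labels over all rows of a tab-separated table."""
    counts = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        counts[classify(int(row["EvAlign"]), int(row["NcAlign"]))] += 1
    return counts


if __name__ == "__main__":
    print(dict(tally(io.StringIO(SAMPLE))))
```

To apply the sketch to the real supplemental table, pass an open file handle to `tally` in place of the in-memory sample, after adjusting the header names and delimiter to match the actual file.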