TOXIFY: a deep learning approach to classify animal venom proteins

T Jeffrey Cole; Michael S Brewer

doi:10.7287/peerj.preprints.27498v1

NOT PEER-REVIEWED

"PeerJ Preprints" is a venue for early communication or feedback before peer review. Data may be preliminary.

A peer-reviewed article of this Preprint also exists.

T Jeffrey Cole, Michael S Brewer

Department of Biology, East Carolina University, Greenville, NC, United States

Published: 2019-01-22
Accepted: 2019-01-22

Subject Areas: Bioinformatics, Computational Biology, Genomics
Keywords: Venom, Deep Learning, Protein Classification, Transcriptome, Proteome

Copyright: © 2019 Cole et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.

Cite this article: Cole TJ, Brewer MS. 2019. TOXIFY: a deep learning approach to classify animal venom proteins. PeerJ Preprints 7:e27498v1 https://doi.org/10.7287/peerj.preprints.27498v1

Abstract

In the era of Next-Generation Sequencing and shotgun proteomics, the sequences of animal toxigenic proteins are being generated at rates exceeding the pace of traditional means for empirical toxicity verification. To facilitate the automation of toxin identification from protein sequences, we trained Recurrent Neural Networks with Gated Recurrent Units on publicly available datasets. The resulting models are available via the novel software package TOXIFY, allowing users to infer the probability of a given protein sequence being a venom protein. TOXIFY is more than 20× faster and uses over an order of magnitude less memory than previously published methods. Additionally, TOXIFY is more accurate, precise, and sensitive than those methods at classifying venom proteins.

Availability: https://www.github.com/tijeco/toxify
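
For readers unfamiliar with the model family named in the abstract, the following is a minimal sketch of how a recurrent network with a Gated Recurrent Unit can map a protein sequence to a single probability. It is not TOXIFY's code, architecture, or input encoding: the one-hot residue encoding, the hidden size, the omission of bias terms, the class name `GRUClassifier`, and the randomly initialised (untrained) weights are all assumptions made for illustration, so the output carries no biological meaning.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(residue):
    """Encode a residue as a 20-dim one-hot vector (unknown residues become all zeros)."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUClassifier:
    """Hypothetical single-layer GRU with a sigmoid output; weights are untrained."""

    def __init__(self, hidden_size=8, seed=0):
        rng = random.Random(seed)
        n_in = len(AMINO_ACIDS)
        self.hidden_size = hidden_size
        # Input (W) and recurrent (U) weights for the update gate "z",
        # the reset gate "r", and the candidate state "h". Biases are omitted.
        self.W = {g: random_matrix(hidden_size, n_in, rng) for g in "zrh"}
        self.U = {g: random_matrix(hidden_size, hidden_size, rng) for g in "zrh"}
        self.out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]

    def step(self, x, h):
        """One GRU time step: returns the new hidden state."""
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b)
                for a, b in zip(matvec(self.W["h"], x), matvec(self.U["h"], reset_h))]
        # One common convention: z interpolates between the old state and the candidate.
        return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def predict_proba(self, sequence):
        """Read the sequence residue by residue, then squash the final state to (0, 1)."""
        h = [0.0] * self.hidden_size
        for residue in sequence.upper():
            h = self.step(one_hot(residue), h)
        return sigmoid(sum(w * hi for w, hi in zip(self.out, h)))


if __name__ == "__main__":
    model = GRUClassifier()
    # Made-up peptide string, used only to exercise the code.
    print(model.predict_proba("MKTLLLTLVVVTIVCLDLGYT"))
```

In a trained model the weights would be fitted to labelled venom and non-venom sequences, and the final value would be read as the inferred probability that the input is a venom protein. The recurrent design lets one set of weights handle sequences of any length, because the hidden state is updated once per residue.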

Author Comment

This is a submission to PeerJ for review.

2569 days ago - Jason Macrander

Hello,

I am very much looking forward to giving this a try, but I noticed that you referenced Venomix as a time-consuming pipeline. We designed it specifically not to be one. From our manuscript:

"Venomix was tested on the University of North Carolina at Charlotte COPPERHEAD Research Computing Cluster, while requesting computational resources that may mimic most personal laptops/desktops, specifically one processor and 4 GB of RAM. Using these settings the Venomix completed in less than twenty minutes for each of the focal transcriptome."

If you encountered any errors or longer-than-anticipated wait times when you ran Venomix, please let me know.

Best,

~ Jason
