DeephageTP: a convolutional neural network framework for identifying phage-specific proteins from metagenomic sequencing data

Bioinformatics and Genomics


Introduction

Materials & Methods

Datasets

Protein sequence encoding

The CNN-based deep learning model

Evaluation metrics

Loss value computation

DeephageTP application on real metagenomic datasets

Alignment-based methods for comparison

Results

Prediction performance of the CNN-based model on the training dataset

Prediction performance of CNN-based model on mock metagenomic dataset

Application of framework DeephageTP on real metagenomic datasets

Discussion

Conclusions

Supplemental Information

The length distribution of the three protein sequences

DOI: 10.7717/peerj.13404/supp-1

The Venn diagrams of the prediction results of three methods (i.e., DeephageTP, Diamond, and HMMER) on the metagenomic datasets

ERR2868024: A(TerL), B(Portal), C(TerS); SRR7892426: D(TerL), E(Portal), F(TerS).

DOI: 10.7717/peerj.13404/supp-2

Supplemental Tables

DOI: 10.7717/peerj.13404/supp-3

Additional Information and Declarations

Competing Interests

The authors declare there are no competing interests.

Author Contributions

Yunmeng Chu conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Shun Guo conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Dachao Cui performed the experiments, prepared figures and/or tables, and approved the final draft.

Xiongfei Fu conceived and designed the experiments, authored or reviewed drafts of the article, and approved the final draft.

Yingfei Ma conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the article, and approved the final draft.

Data Availability

The following information was supplied regarding data availability:

The code and data are available at GitHub: https://github.com/chuym726/DeephageTP.

Funding

This work was supported by the Ministry of Science and Technology of China (http://www.most.gov.cn, grant no. 2018YFA0903100), the Guangdong Provincial Key Laboratory of Synthetic Genomics (2019B030301006), the Shenzhen Key Laboratory of Synthetic Genomics (ZDSYS201802061806209), and the Shenzhen Peacock Team Project (KQTD2016112915000294). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
