Phylogenomic analysis of 589 metagenome-assembled genomes encompassing all major prokaryotic lineages from the gut of higher termites
A peer-reviewed article of this Preprint also exists.
Abstract
“Higher” termites have been able to colonize all tropical and subtropical regions because of their ability to digest lignocellulose with the aid of their prokaryotic gut microbiota. Over the last decade, numerous studies based on 16S rRNA gene amplicon libraries have largely described both the taxonomy and structure of the prokaryotic communities associated with termite guts. Host diet and microenvironmental conditions have emerged as the main factors structuring the microbial assemblages in the different gut compartments. Additionally, these molecular inventories have revealed the existence of termite-specific clusters that indicate coevolutionary processes in numerous prokaryotic lineages. However, for lack of representative isolates, the functional role of most lineages remains unclear. We reconstructed 589 metagenome-assembled genomes (MAGs), encompassing 17 prokaryotic phyla, from the different gut compartments of eight higher termite species. By iteratively building genome trees for each clade, we significantly improved the initial automated assignment, frequently up to the genus level. We recovered MAGs from most of the termite-specific clusters in the radiation of, e.g., Planctomycetes, Fibrobacteres, Bacteroidetes, Euryarchaeota, Bathyarchaeota, Spirochaetes, Saccharibacteria, and Firmicutes, which to date contained few or no representative genomes. Moreover, the MAGs included abundant members of the termite gut microbiota. This dataset represents the largest genomic resource for arthropod-associated microorganisms available to date and contributes substantially to populating the tree of life. More importantly, it provides a backbone for studying the metabolic potential of the termite gut microbiota, including the key members involved in carbon and nitrogen biogeochemical cycles, and important clues that may help in cultivating representatives of these understudied clades.
Cite this as
2019. Phylogenomic analysis of 589 metagenome-assembled genomes encompassing all major prokaryotic lineages from the gut of higher termites. PeerJ Preprints 7:e27929v1 https://doi.org/10.7287/peerj.preprints.27929v1
Author comment
This is a submission to PeerJ for review.
Supplemental Information
Phylogenomic distribution of the MAGs according to the host diet
The outer rings show the occurrence of MAGs in termites with different diets. The maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I model of amino-acid evolution.
Phylogenomic distribution of the MAGs according to the gut compartment of the host
The outer rings show the occurrence of MAGs in the different termite gut compartments: C crop (foregut), M midgut, P1–P5 proctodeal compartments (hindgut). The maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I model of amino-acid evolution.
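The trees described in this section were inferred from a concatenated alignment of 43 proteins. As a minimal sketch of what such a concatenation step can look like (not the authors' pipeline), the Python function below joins per-marker alignments into a single supermatrix and pads genomes that lack a marker with gaps. The function name `concatenate_alignments`, the gap-padding choice, and the toy sequences are assumptions made for illustration only.

```python
from typing import Dict, List


def concatenate_alignments(markers: List[Dict[str, str]]) -> Dict[str, str]:
    """Concatenate per-marker alignments into one supermatrix.

    Each marker is a dict mapping genome name -> aligned sequence.
    All sequences within one marker must have the same length.
    Genomes missing from a marker are padded with gap characters
    (an assumption of this sketch, not a documented choice of the study).
    """
    # Collect every genome seen in any marker, in a stable order.
    genomes = sorted({name for marker in markers for name in marker})
    parts: Dict[str, List[str]] = {name: [] for name in genomes}

    for marker in markers:
        lengths = {len(seq) for seq in marker.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a marker must have equal length")
        width = lengths.pop()
        for name in genomes:
            # Use the aligned sequence if present, otherwise an all-gap block.
            parts[name].append(marker.get(name, "-" * width))

    return {name: "".join(blocks) for name, blocks in parts.items()}


# Toy data: two hypothetical marker alignments (the study used 43 proteins).
marker_a = {"MAG_1": "MKV-L", "MAG_2": "MKVAL", "REF_1": "MRVAL"}
marker_b = {"MAG_1": "GT-S", "REF_1": "GTAS"}
supermatrix = concatenate_alignments([marker_a, marker_b])
```

Padding with gaps keeps every row the same length, which tree-inference programs generally require; incomplete genomes such as MAGs often lack some markers, so a real workflow would also need a rule for excluding genomes with too few markers.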
Phylogenomic tree of the Archaea
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I+F model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. The Asgard group was used as outgroup. Names in bold include MAGs recovered in the present study.
Phylogenomic tree of the Ruminococcaceae family (Firmicutes)
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. Dorea and Butyrivibrio (Lachnospiraceae) species were used as outgroup. Names in bold include MAGs recovered in the present study.
Phylogenomic tree of the Actinobacteria
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I+F model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. Chloroflexi species were used as outgroup. Names in bold include MAGs recovered in the present study.
Phylogenomic tree of the Spirochaetes
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I+F model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. Elusimicrobia and Cyanobacteria were used as outgroup. Names in bold include MAGs recovered in the present study.
Phylogenomic tree of the Fibrobacteres
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I+F model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. Bacteroidetes were used as outgroup. Names in bold include MAGs recovered in the present study.
Phylogenomic tree of the Desulfovibrionaceae family (Deltaproteobacteria)
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. Desulfonatronum species were used as outgroup. Names in bold include MAGs recovered in the present study.
Phylogenomic tree of the Bacteroidetes
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. Chlorobi species were used as outgroup. Names in bold include MAGs recovered in the present study.
Phylogenomic tree of the Chloroflexi, Saccharibacteria and Microgenomates
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I+F model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. Actinobacteria species were used as outgroup. Names in bold include MAGs recovered in the present study.
Phylogenomic tree of the Synergistetes
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I+F model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. Elusimicrobia species were used as outgroup. Names in bold include MAGs recovered in the present study.
Phylogenomic tree of the Planctomycetes
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I+F model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. Verrucomicrobia species were used as outgroup. Names in bold include MAGs recovered in the present study.
Phylogenomic tree of the Elusimicrobia
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I+F model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. Spirochaetes species were used as outgroup. Names in bold include MAGs recovered in the present study.
Phylogenomic tree of the Cloacimonetes
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I+F model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. Fibrobacteres species were used as outgroup. Names in bold include MAGs recovered in the present study.
Phylogenomic tree of the Kiritimatiellaeota
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. Chlamydiae species were used as outgroup. Names in bold include MAGs recovered in the present study.
Phylogenomic tree of the Acidobacteria
This maximum-likelihood tree was inferred from a concatenated alignment of 43 proteins using the LG+G+I+F model of amino-acid evolution. Branch supports were calculated using a Chi2-based parametric approximate likelihood-ratio test. Proteobacteria species were used as outgroup. Names in bold include MAGs recovered in the present study.
Final taxonomic assignment and characteristics of the MAGs
Additional Information
Competing Interests
The authors declare that they have no competing interests.
Author Contributions
Vincent Hervé conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.
Pengfei Liu performed the experiments, analyzed the data, approved the final draft.
Carsten Dietrich conceived and designed the experiments, performed the experiments, approved the final draft.
David Sillam-Dussès contributed reagents/materials/analysis tools, approved the final draft.
Petr Stiblik contributed reagents/materials/analysis tools, approved the final draft.
Jan Šobotník contributed reagents/materials/analysis tools, approved the final draft.
Andreas Brune conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.
Field Study Permissions
The following information was supplied relating to field study approvals (i.e., approving body and any reference numbers):
Field experiments were approved by the French Ministry for the Ecological and Solidarity Transition (UID: ABSCH-CNA-FR-240495-2).
DNA Deposition
The following information was supplied regarding the deposition of DNA sequences:
The data have been deposited at GenBank under the BioProject accession number PRJNA560329; genomes are available with accession numbers SRR9983610-SRR9984198.
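For readers who want to retrieve the records programmatically, the accession range quoted above can be expanded into individual identifiers. The sketch below assumes the range is contiguous and inclusive, which the dash notation suggests but the statement does not confirm; the helper name `expand_accession_range` is hypothetical.

```python
from typing import List


def expand_accession_range(first: str, last: str) -> List[str]:
    """Expand an inclusive accession range such as SRR0000001-SRR0000003.

    Assumes both accessions share the same alphabetic prefix and that
    the numeric part increases by one per record (an assumption of this
    sketch; gaps in the real range would not be detected).
    """
    digits = "0123456789"
    prefix = first.rstrip(digits)
    if last.rstrip(digits) != prefix:
        raise ValueError("accessions must share the same prefix")
    start = int(first[len(prefix):])
    end = int(last[len(prefix):])
    if end < start:
        raise ValueError("last accession must not precede the first")
    # Preserve the zero-padded width of the numeric part.
    width = len(first) - len(prefix)
    return [f"{prefix}{number:0{width}d}" for number in range(start, end + 1)]


accessions = expand_accession_range("SRR9983610", "SRR9984198")
print(len(accessions))  # 589
```

Under the contiguity assumption, the range contains 589 accessions, the same as the number of MAGs reported in the abstract.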
Data Deposition
The following information was supplied regarding data availability:
The accession numbers of the MAGs are provided in Supplementary Table 2.
Funding
This study was funded by the Deutsche Forschungsgemeinschaft in the collaborative research center SFB 987 (Microbial Diversity in Environmental Signal Response) and by the Max-Planck-Gesellschaft. The work conducted by the U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. PS and JŠ were supported by grant "EVA4.0", No. CZ.02.1.01/0.0/0.0/16_019/0000803 financed by OP RDE. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.