Description of a new member of the family Erysipelotrichaceae: Dakotella fusiforme gen. nov., sp. nov., isolated from healthy human feces

A Gram-positive, non-motile, rod-shaped, facultatively anaerobic bacterial strain, SG502T, was isolated from healthy human fecal samples in Brookings, SD, USA. Comparison of the 16S rRNA gene placed the strain within the family Erysipelotrichaceae. Within this family, Clostridium innocuum ATCC 14501T, Longicatena caecimuris strain PG-426-CC-2, Eubacterium dolichum DSM 3991T and E. tortuosum DSM 3987T (=ATCC 25548T) were its closest taxa, with 95.28%, 94.17%, 93.25%, and 92.75% 16S rRNA sequence identity, respectively. Strain SG502T clustered closest to C. innocuum in the 16S rRNA phylogeny. The members of the genus Clostridium within the family Erysipelotrichaceae were previously proposed for reassignment to the genus Erysipelatoclostridium to resolve the misclassification of the genus Clostridium. C. innocuum was therefore also placed in this genus temporarily, with the need to reclassify it in the future because of its differing genomic properties. Similarly, genome sequencing of the strain and comparison with its 16S phylogenetic neighbors and the proposed members of the genus Erysipelatoclostridium showed that SG502T warranted a separate genus, even though its 16S rRNA similarity to C. innocuum was >95%. The strain shared 71.8% ANI, 19.8% [17.4–22.2%] dDDH and 69.65% AAI with its closest neighbor, C. innocuum. The genome size was 2,683,792 bp with 32.88 mol% G+C content, which is about half the size of the C. innocuum genome, and the G+C content differed by about 10 mol%. Phenotypically, the optimal growth temperature and pH for strain SG502T were 37 °C and 7.0, respectively. Acetate was the major short-chain fatty acid produced by the strain when grown in BHI-M medium. The major cellular fatty acids were C18:1ω9c, C18:0 and C16:0. Thus, based on the polyphasic analysis, the name Dakotella fusiforme gen. nov., sp. nov. is proposed, with SG502T (=DSM 107282T =CCOS 1889T) as the type strain.


INTRODUCTION
The members of the family Erysipelotrichaceae have been isolated from the intestinal tracts of mammals (Alcaide et al., 2012; Greiner & Backhed, 2011; Han et al., 2011) and insects (Egert et al., 2003) and are associated with host metabolism and inflammatory diseases.

Bacterial isolation and culture condition
The strain was isolated from a healthy human fecal sample as part of a culturomics study. The collection of the human fecal samples was approved by the Institutional Review Board (approval #IRB-1709018-EXP) at South Dakota State University, Brookings, SD, USA. The fecal samples were collected after receiving informed consent from the donors. Fresh fecal samples were transferred into the anaerobic chamber (85% nitrogen, 10% hydrogen and 5% carbon dioxide) within 10 min of voiding, diluted 10 times with anaerobic PBS and stored with 18% DMSO at −80 °C. The sample was cultured in modified BHI medium (BHI-M) containing 37 g/L of BHI, 5 g/L of yeast extract, 1 mL of 1 mg/mL menadione, 0.3 g of L-cysteine, 1 mL of 0.25 mg/L resazurin, 1 mL of 0.5 mg/mL hemin, 10 mL of vitamin and mineral mixture, 1.7 mL of 30 mM acetic acid, 2 mL of 8 mM propionic acid, 2 mL of 4 mM butyric acid, 100 µl of 1 mM isovaleric acid, and 1% pectin and inulin. After isolation, the strain was subjected to MALDI-ToF (Bruker, Germany). Since MALDI-ToF did not identify a species, 16S rRNA gene sequencing was performed for species identification.

Phenotypic and chemotaxonomic characterization
For morphological, physiological and biochemical characterization, the strain was cultivated in BHI-M medium under anaerobic conditions at 37 °C and pH 6.8 ± 0.2. Colony characteristics were determined after streaking the strain on BHI-M agar plates followed by 48 h of anaerobic incubation. Gram staining was performed using a Gram staining kit (BD Difco) according to the manufacturer's protocol. Cell morphology and flagellation were examined by scanning electron microscopy (SEM) during the exponential growth phase. SG502T was grown separately under aerobic and anaerobic conditions to determine its aerotolerance. Further, the strain was grown at 4, 20, 30, 40 and 55 °C under anaerobic conditions to determine its temperature range for growth. BHI-M medium was adjusted to pH levels between 4 and 9 with 0.1 N HCl and 0.1 N NaOH to determine the growth of the strain at different pH levels. BHI-M medium was supplemented with triphenyltetrazolium chloride (TTC) (Shields & Cathcart, 2011) to determine the motility of the strain.
The phenotypic and biochemical characterizations were performed using AN MicroPlate (Biolog) and API ZYM (bioMerieux) according to the manufacturers' instructions. In addition, after growing strains SG502T and ATCC 14501T in BHI-M medium at 37 °C for 24 h, cells were harvested for cellular fatty acid analysis. Fatty acids were extracted, purified, methylated, identified and analyzed using GC (Agilent 7890A) according to the manufacturer's instructions (MIDI) (Sasser, 1990). Further, short-chain fatty acid (SCFA) production was determined by gas chromatography after cells were grown in BHI-M medium. For SCFA estimation, 800 µl of the bacterial culture was collected and 160 µl of freshly prepared 25% (w/v) meta-phosphoric acid was added before freezing at −80 °C. The samples were thawed and centrifuged at >20,000×g for 30 min before injecting 600 µl of the supernatant into the TRACE1310 GC system (ThermoScientific, Waltham, MA, USA).
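The acidification step changes the sample volume, so concentrations read from the chromatograph refer to the diluted sample. The sketch below works out the dilution factor and the final acid concentration from the volumes given above. The function name is hypothetical, and whether the measured SCFA values were corrected for this dilution is not stated in the text, so the correction itself is an assumption.

```python
def acidified_sample(culture_ul: float, acid_ul: float, acid_stock_pct: float):
    """Return (dilution_factor, final_acid_pct) after adding acid to a culture aliquot."""
    total_ul = culture_ul + acid_ul
    # Factor by which a measured concentration must be multiplied
    # to refer back to the undiluted culture.
    dilution_factor = total_ul / culture_ul
    # Final acid concentration in the mixture (same units as the stock, % w/v).
    final_acid_pct = acid_stock_pct * acid_ul / total_ul
    return dilution_factor, final_acid_pct


# Volumes from the SCFA protocol: 800 µl culture + 160 µl of 25% (w/v) acid.
factor, acid_pct = acidified_sample(800, 160, 25.0)
print(f"dilution factor: {factor:.2f}")
print(f"final meta-phosphoric acid: {acid_pct:.2f}% (w/v)")
```

With these volumes the culture is diluted 1.2-fold and the acid ends up at roughly 4.2% (w/v).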

Phylogenetic analysis
Genomic DNA from the strain was isolated using the E.Z.N.A bacterial DNA isolation kit (Omega Biotek) following the manufacturer's instructions. The 16S rRNA gene was amplified using the universal primer set 27F (5′-AGAGTTTGATCMTGGCTCAG-3′) and 1492R (5′-ACCTTGTTACGACTT-3′) and sequenced using Sanger sequencing chemistry (ABI 3730XL; Applied Biosystems). The sequences were assembled using Geneious 10.2.3. The nearly complete 16S rRNA gene sequence obtained was used for a similarity search against valid taxonomic names in the EzTaxon-e program (http://www.ezbiocloud.net/). The bacterial species that most closely resembled the query sequence were then used for alignment and phylogenetic analysis in MEGAX software (Kumar et al., 2018). The sequences were first aligned using MUSCLE (Edgar, 2004), and the Neighbor Joining method (Saitou & Nei, 1987) was used to reconstruct the phylogenetic tree employing the Kimura 2-parameter model (Kimura, 1980) with 1,000 bootstrap replicates. Phylogenetic trees were also constructed using the maximum-likelihood (Felsenstein, 1981) and minimum evolution methods (Rzhetsky & Nei, 1992). Clostridium butyricum ATCC 19398T was used as an out-group.
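As an illustration of the distance model named above, the Kimura 2-parameter distance between two aligned sequences is d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitions and transversions. The sketch below is a minimal stand-alone version and is not the MEGAX implementation; skipping gapped or ambiguous sites pairwise is an assumption, and the function name and the toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguity codes pairwise
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy alignment: 10 sites, one transition (A/G) and one transversion (A/T).
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAT"), 4))  # ≈ 0.2341
```

The resulting pairwise distances are what a Neighbor Joining procedure takes as input; the tree building itself is not sketched here.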

Genomic features and comparison
For whole genome sequencing of SG502T, we used 0.3 ng of genomic DNA for library preparation. The library was sequenced on an Illumina MiSeq using 2×250 paired-end V2 chemistry. The genome was assembled from raw fastq files using Unicycler, which builds an initial assembly graph from short reads using the de novo assembler SPAdes 3.11.1 (Bankevich et al., 2012). Quality assessment of the assembly was performed using QUAST (Gurevich et al., 2013). Genome annotation was performed using Prokka 1.13 (Seemann, 2014). The genome of SG502T was visualized using DNAplotter (Carver et al., 2009).
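Two of the assembly statistics reported in the Results, N50 and G+C content, have simple definitions that can be sketched directly. The code below is an illustration of those definitions and not the QUAST implementation; the function names and the toy contigs are hypothetical.

```python
def n50(contig_lengths):
    """Length of the contig at which the cumulative length, counted from the
    longest contig downwards, first reaches half of the total assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs supplied")


def gc_percent(sequences):
    """G+C content (%) over all unambiguous bases (A, C, G, T) in the contigs."""
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in sequences)
    acgt = sum(sum(s.upper().count(base) for base in "ACGT") for s in sequences)
    return 100.0 * gc / acgt


# Toy assembly: total length 300, so N50 is reached at the second contig.
print(n50([80, 70, 50, 40, 30, 20, 10]))   # 70
print(gc_percent(["GGCC", "ATAT"]))        # 50.0
```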

RESULTS
Strain SG502T was isolated from a healthy human fecal sample during a culturomics study of the human gut microbiota. The colonies of the strain appeared white, smooth and convex with entire edges. The cells were initially subjected to MALDI-ToF MS (Fig. 1A), which returned a score <1.70, indicating no reliable identification. Thus, further phenotypic characterization and genetic methods were employed to identify the strain.
Morphologically, individual cells of the strain appeared as Gram-positive rods. Under SEM, the cells were slender with tapering ends and measured 1.5×0.35 µm (Fig. 1B and Table 1). No flagella were observed under SEM, suggesting a non-motile nature, which was also confirmed by the TTC assay. The strain also lacked endospores, as previously reported for members of Erysipelotrichaceae (Verbarg et al., 2004). The strain grew in a pH range of 6.0-7.5 with optimal growth at pH 7.0. It grew anaerobically over the temperature range of 25-45 °C with optimal growth at 37 °C. The strain grew well in BHI-M under anaerobic conditions, whereas under aerobic conditions growth was lower and slower, indicating that the strain is a facultative anaerobe. Based on the carbon source utilization test (Biolog AN plate), the strain utilizes glucose, sorbitol, maltose, arbutin, D-fructose, L-fucose, palatinose, dextrin, turanose, D-trehalose, L-rhamnose, uridine, pyruvic acid methyl ester, pyruvic acid, 3-methyl-D-glucose, gentiobiose, maltotriose, dulcitol, L-phenylalanine, α-ketovaleric acid, N-acetyl-D-glucosamine, N-acetyl-β-D-mannosamine, cellobiose, α-ketobutyric acid and D-galacturonic acid. SG502T also assimilated sorbitol and maltose, which were not utilized by its closest neighbor C. innocuum ATCC 14501T. Unlike C. innocuum, SG502T was unable to utilize sucrose, salicin, mannitol, lactose and raffinose. Positive enzymatic activities for leucine arylamidase, cystine arylamidase, α-chymotrypsin and acid phosphatase were observed for C. innocuum, differentiating it from SG502T. Detailed phenotypic and biochemical characteristics of the strain are presented in Table 1. The major cellular fatty acids identified were C18:1ω9c (29.82%), C18:0 (22.55%) and C16:0 (14.7%), compared with C18:1ω9c (14.64%), C18:0 (10.56%) and C16:0 (23.7%) in C. innocuum ATCC 14501T (Table 2). A detailed comparison of the fatty acids of SG502T, C. innocuum ATCC 14501T and E. dolichum DSM 3991T is given in Table 2. Additionally, the major SCFA identified for SG502T in BHI-M medium was acetate. Low but detectable amounts of propionate and butyrate were also produced by the strain. The utilization of such a broad range of substrates and the production of SCFAs may be an ecologically effective trait against pathogen colonization in the gut.
As the strain was not identified using MALDI-ToF, the 16S rRNA gene was amplified to obtain a continuous 1338 bp sequence, which was searched against the EzTaxon 16S rRNA gene database for identification. The closest species identified were all from the family Erysipelotrichaceae and included C. innocuum ATCC 14501T, L. caecimuris strain PG-426-CC-2, E. dolichum DSM 3991T and E. tortuosum ATCC 25548T, with 95.28%, 94.17%, 93.25%, and 92.75% sequence identity, respectively. Currently, the cut-offs for species- and genus-level classification of bacteria based on the 16S rRNA gene are <98.7% (Stackebrandt & Ebers, 2006) and <94.5% identity (Yarza et al., 2014), respectively. Thus, strain SG502T and C. innocuum were suggested to fall within the same genus but different species. The phylogenetic analysis also revealed that the isolate belonged to the family Erysipelotrichaceae, where strain SG502T was closely associated with C. innocuum ATCC 14501T but further from L. caecimuris strain PG-426-CC-2, E. dolichum DSM 3991T and E. tortuosum ATCC 25548T, which altogether formed a larger clade (Fig. 2). The separation of these four species from strain SG502T did not depend on the phylogenetic algorithm and was supported by a 100% bootstrap value.
To further differentiate the strain, we sequenced its whole genome, which is visualized in Fig. 3. The draft genome of strain SG502T was 2,683,792 bp with 32.88 mol% G+C content. The largest contig was 154,144 bp and the N50 was 52,214 bp. The total numbers of predicted coding sequences, tRNAs, rRNAs, and tmRNAs were 2654, 49, 2 and 1, respectively. C. innocuum was the nearest neighbor of SG502T based on 16S rRNA phylogeny. C. innocuum, along with C. cocleatum, C. saccharogumia, C. ramosum, and C. spiroforme, was previously suggested for reclassification into the genus Erysipelatoclostridium, with C. innocuum needing further reclassification (Yutin & Galperin, 2013). Therefore, we checked the 16S identity of SG502T with the other members of this proposed genus, C. cocleatum, C. saccharogumia, C. ramosum, and C. spiroforme, in NCBI. These species were found to be 84.45%, 84.49%, 85.10% and 85.014% identical, respectively, which demonstrated that SG502T should not be placed in the same genus as these species. We also compared the genomic properties of the strain with those of its 16S rRNA based phylogenetic neighbors, including C. innocuum, and of the members of the previously proposed genus Erysipelatoclostridium, C. cocleatum, C. saccharogumia, C. ramosum, and C. spiroforme. The genome sizes and G+C content of these neighbors varied widely and differed markedly from those of SG502T (Table 3). The C. innocuum genome was 4,772,018 bp in length with 43.4 mol% G+C content, while the SG502T genome was 2,683,792 bp with 32.88 mol% G+C content. Because of these large differences in genome properties, we performed further comparisons to assess whether SG502T represents a novel genus. The genome of the strain was compared with those of its neighbors using OrthoANI, as shown in Fig. 4. Strain SG502T was 71.8% similar to its nearest neighbor C. innocuum and had lower similarities with other neighbors (Figs. 4A, 4B). The proposed cut-off for OrthoANI for a new species is 95-96% (Kim et al., 2014; Lee et al., 2016). The dDDH was only 19.8% between SG502T and C. innocuum (Table 4).
One of the major methods to demarcate a genus is to calculate the average amino acid identity (AAI) between the genomes with the  Tiedje, 2007). Strain SG502T showed the highest AAI with C. innocuum (69.65%), followed by L. caecimuris (63.45%) and E. dolichum (63.02%) (Fig. 5), supporting the designation of strain SG502T as a novel genus.

Figure 2 Neighbor Joining tree of 16S rRNA gene sequences of SG502T with related species under the Erysipelotrichaceae family. GenBank accession numbers of the 16S rRNA gene sequences are given in parentheses. The sequences were aligned using MUSCLE (Edgar, 2004) and the evolutionary distances were computed using the Kimura 2-parameter method to obtain the phylogenetic tree in MEGAX (Kumar et al., 2018) after 1,000 bootstrap tests (shown as percentages next to the branches where the associated taxa clustered together). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. Bar, 0.05 substitutions per nucleotide position. Clostridium butyricum ATCC 19398T was used as an outgroup. Full-size DOI: 10.7717/peerj.10071/fig-2
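The way the 16S rRNA and OrthoANI cut-offs quoted above interact can be made explicit with a small sketch. It applies the <98.7% and <94.5% 16S rRNA cut-offs and the lower bound of the 95-96% OrthoANI species range to the identities reported for SG502T. The function and constant names are hypothetical, and the returned labels are only a reading of the thresholds, not a taxonomic decision procedure.

```python
# Cut-offs quoted in the text (percent identity).
SPECIES_16S = 98.7   # below this: different species
GENUS_16S = 94.5     # below this: different genus
SPECIES_ANI = 95.0   # lower bound of the 95-96% OrthoANI species range


def rank_from_16s(identity: float) -> str:
    """Taxonomic relationship suggested by a pairwise 16S rRNA identity."""
    if identity < GENUS_16S:
        return "different genus"
    if identity < SPECIES_16S:
        return "same genus, different species"
    return "possibly same species"


def same_species_by_ani(ani: float) -> bool:
    """True if the OrthoANI value reaches the species cut-off."""
    return ani >= SPECIES_ANI


# 16S rRNA identities of SG502T to its closest neighbors (from the Results).
neighbors = {
    "C. innocuum": 95.28,
    "L. caecimuris": 94.17,
    "E. dolichum": 93.25,
    "E. tortuosum": 92.75,
}
for name, identity in neighbors.items():
    print(f"{name}: {rank_from_16s(identity)}")

# OrthoANI between SG502T and C. innocuum (from the Results).
print("same species by OrthoANI:", same_species_by_ani(71.8))
```

Only C. innocuum falls in the "same genus, different species" band on 16S rRNA identity alone, which is why the genome-based measures (OrthoANI, dDDH and AAI) carry the argument for a separate genus.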

DISCUSSION
Recently, next generation sequencing and high-throughput culturing methods have been employed for large scale culturing of the unknown gut microbiota. This new approach, termed "culturomics", has evolved as a tool to culture previously uncultured bacteria (Browne et al., 2016; Lagier et al., 2016). However, culture-independent studies have also highlighted that a diverse population of gut bacteria is yet to be cultivated (Almeida et al., 2019; Lagier et al., 2012). Pure cultures of these bacteria are essential to elucidate their role in health and disease, both for experimental models and for therapeutic purposes (Daillere et al., 2016; Kobyliak et al., 2016; Vetizou et al., 2015). In this study, we report the culturing and characterization of a previously uncultured bacterium, SG502T, from healthy human fecal samples that belongs to a new genus and species. We also employed a taxono-genomics approach (Fournier & Drancourt, 2015) to determine the phenotypic and genetic properties of the taxon. 16S rRNA gene sequence homology is the most widely used method to determine the novelty of a prokaryotic organism, with varying threshold values at distinct taxonomic levels (Clarridge 3rd, 2004; Kim et al., 2014). Therefore, we performed a 16S rRNA based phylogenetic analysis of strain SG502T, which showed it to be a member of the family Erysipelotrichaceae. Within this family, it clustered together with Clostridium innocuum, Longicatena caecimuris, Eubacterium dolichum and Eubacterium tortuosum, with C. innocuum as the closest member. C. innocuum, along with other misclassified Clostridia within the family Erysipelotrichaceae, was proposed to be reclassified into gen. nov. Erysipelatoclostridium. The members of this proposed genus are Gram-positive, nonmotile, obligately anaerobic, straight or helically curved rods that rarely form spores. Their G+C content is low and varies from 27-33 mol% (Yutin & Galperin, 2013). However, C. innocuum was identified as a distantly related member of Erysipelatoclostridium with a higher G+C content of 43-44% and in need of reclassification (Yutin & Galperin, 2013). In this context, we also determined the 16S identity of strain SG502T with the proposed members of the genus Erysipelatoclostridium. These members were <86% similar to SG502T at the 16S sequence level, suggesting the uniqueness of SG502T. Phenotypically, SG502T showed several differences in carbon source utilization, enzymatic activity and fatty acid content when compared with its phylogenetic neighbors (Tables 1 and 2). In addition, whole genome sequence comparison revealed its distinctiveness with respect to its 16S phylogenetic neighbors and the proposed members of the genus Erysipelatoclostridium (Tables 3 and 4). Furthermore, OrthoANI based genomic comparison with 16S phylogenetic neighbors showed as high as 71.54% similarity with L. caecimuris DSM 29481T. Also, the genome of SG502T was only 82.21% similar to that of C. cocleatum DSM 1551T, which is a member of the genus Erysipelatoclostridium (Fig. 4). Major differences were evident in the dDDH and average amino acid identity comparisons as well (Table 4, Fig. 5). Finally, the genome of the nearest neighbor C. innocuum was nearly twice the size of that of SG502T, and the difference in G+C content was comparatively high (>10 mol%), suggesting that the strain is not genetically close to C. innocuum and that SG502T requires placement in a separate genus.
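The genome size and G+C comparison in the last sentence can be reproduced from the values reported in the Results. The following lines are only a worked check of that arithmetic; the variable names are hypothetical.

```python
# Genome length (bp) and G+C content (mol%) as reported in the Results.
sg502_bp, sg502_gc = 2_683_792, 32.88
innocuum_bp, innocuum_gc = 4_772_018, 43.4

size_ratio = innocuum_bp / sg502_bp        # how many times larger C. innocuum is
gc_difference = innocuum_gc - sg502_gc     # difference in mol% G+C

print(f"genome size ratio: {size_ratio:.2f}")        # 1.78
print(f"G+C difference: {gc_difference:.2f} mol%")   # 10.52
```

The ratio of about 1.78 corresponds to "nearly twice" in the text, and the G+C difference of 10.52 mol% to the ">10 mol%" statement.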

CONCLUSION
Despite the 95.28% 16S rRNA similarity of SG502T with its nearest neighbor C. innocuum, the differences in its physiological, biochemical, and whole genome sequence properties suggest its placement in a novel genus. The cells of the bacterium are facultatively anaerobic, Gram-positive, non-motile rods. The average size of the cell is 1.5×0.35 µm. Bacterial colonies on BHI-M agar are white, convex and entire, approximately 0.1 cm in diameter. The optimum temperature and pH for anaerobic growth are 37 °C and 7.0, respectively. Strain SG502T utilizes glucose, sorbitol, maltose, arbutin, D-fructose, L-fucose, palatinose, dextrin, turanose, D-trehalose, L-rhamnose, uridine, pyruvic acid methyl ester, pyruvic acid, 3-methyl-D-glucose, gentiobiose, maltotriose, dulcitol, L-phenylalanine, α-ketovaleric acid, N-acetyl-D-glucosamine, N-acetyl-β-D-mannosamine, cellobiose, α-ketobutyric acid and D-galacturonic acid. Positive enzymatic reactions were observed for alkaline phosphatase only. The primary short-chain fatty acid produced by the strain is acetate, while small amounts of propionate and butyrate were also noted. The major cellular fatty acids of strain SG502T are C18:1ω9c, C18:0 and C16:0. The type strain, SG502T (=DSM 107282T =CCOS 1889T), was isolated from a healthy human fecal sample. The genome size of the strain is 2,683,792 bp and its G+C content is 32.88 mol%.

PROTOLOGUE
The GenBank accession number for the 16S rRNA gene sequence of strain SG502T is MN266902. The GenBank BioProject ID number for the draft genome sequence of strain SG502T is PRJNA494608.

MALDI-ToF  Matrix Assisted Laser Desorption/Ionization-Time of Flight
ANI  Average Nucleotide Identity