A genomic hotspot of diversifying selection and structural change in the hoary bat (Lasiurus cinereus)

Bioinformatics and Genomics

Main article text

 

Introduction

Materials and Methods

Results and discussion

Structural evolution of the CE and NF synteny blocks in bats and tetrapod outgroups

Functional roles of positively selected genes and synteny blocks

Birth and death of a Tbx1-like gene family in Vespertilionidae

Caveats and conclusions

Supplemental Information

Phylogenetic topologies used as guide trees for evolutionary analysis.

Tree 1 is consistent with Amador et al. (2018). Tree 2 differs from Tree 1 only in postulating that Lasiurus is closer to Pipistrellus than the latter is to Eptesicus. Tree 3 is consistent with Agnarsson et al. (2011). Tree 4 is the same as Tree 1 except that four taxa with skewed nucleotide composition have been pruned (see text for details).

DOI: 10.7717/peerj.17482/supp-1

Comparison of scaffold length and coverage patterns to the karyotype of Lasiurus cinereus.

A) Scaffold length, relative coverage in a population data set, and coefficient of variation (CV) of coverage among the 23 individual L. cinereus samples in that data set (see text for details). The chromosome type was inferred by the author from a comparison of panels A and B, using the terminology of Bickham (1987). The rows are color-coded by chromosome type in each panel. B) Karyotype of L. cinereus with colored boxes added by the author to represent the distinct chromosome types postulated in panel A and discussed in Bickham (1987). Karyotype image reproduced by permission from Bickham (1987) (©: Oxford University Press). C) Variation in relative coverage of chromosomes in mapped reads from the population data set, with the accession for each individual sample listed on the horizontal axis. The lines highlight trends among samples and chromosomes, using four line colors that correspond to the chromosome types illustrated in the preceding panels. Thus, gray lines represent relative coverage for large metacentric chromosomes, blue lines represent coverage for medium metacentric chromosomes, green lines represent coverage for short acrocentric chromosomes, and the red line represents coverage for the X chromosome.

DOI: 10.7717/peerj.17482/supp-2
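
The coverage summaries in panel A can be illustrated with a minimal Python sketch. It assumes that relative coverage means each sample's per-scaffold depth divided by that sample's median depth, and that CV is the sample standard deviation divided by the mean; the caption does not state either definition. The scaffold names and depths are hypothetical.

```python
from statistics import mean, median, stdev

def relative_coverage(depths):
    """Scale one sample's per-scaffold depths by that sample's median depth
    (an assumed normalization, not necessarily the one used in the article)."""
    mid = median(depths.values())
    return {scaffold: depth / mid for scaffold, depth in depths.items()}

def coverage_cv(samples):
    """Return {scaffold: (mean relative coverage, CV)} across samples.
    CV is computed as sample standard deviation / mean."""
    rel = [relative_coverage(d) for d in samples]
    summary = {}
    for scaffold in rel[0]:
        values = [r[scaffold] for r in rel]
        avg = mean(values)
        summary[scaffold] = (avg, stdev(values) / avg)
    return summary

# Hypothetical depths for three samples; not real data.
samples = [
    {"scaf1": 30.0, "scaf2": 30.0, "scafX": 15.0},
    {"scaf1": 20.0, "scaf2": 20.0, "scafX": 20.0},
    {"scaf1": 40.0, "scaf2": 40.0, "scafX": 20.0},
]
for scaffold, (avg, cv) in coverage_cv(samples).items():
    print(f"{scaffold}\tmean_rel_cov={avg:.2f}\tCV={cv:.2f}")
```

In this toy input, only the scaffold whose depth varies relative to the others among samples receives a nonzero CV.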

Trpc4 gene structures in bats and carnivores do not suggest annotation errors or divergence in exon number that could lead to paralogous codon alignments and thus invalid estimates of evolutionary rate.

A. Screen captures of annotated Trpc4 transcripts in representative carnivore and bat genomes showing similar gene architectures. Images are from the Genome Data Viewer webtool of the National Center for Biotechnology Information (NCBI). B. Screen capture of a TRPC4 protein isoform from cat aligned to a representative bat genome. Aligned segments of the protein query match only annotated exons in the subject genome.

DOI: 10.7717/peerj.17482/supp-3

A strong shift in background nucleotide composition occurs in four of the thirteen bat taxa analyzed.

A) Percentage of coding sequence that is G or C for each analyzed gene. Genes are sorted by genomic order in Lasiurus cinereus (genes absent in this species are not shown). Species are colored by taxonomic clade as in Fig. 7. Phyllostomid species have higher GC content across the entire region, whereas higher GC content in Molossus is largely restricted to genes of the NF block (see text for details). B) GC content in representative primates and rodents, which show much less variation among taxa.

DOI: 10.7717/peerj.17482/supp-4
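
The GC percentage plotted in panel A can be computed along the following lines. This is a sketch under the assumption that only unambiguous bases (A, C, G, T) count toward the denominator, so that alignment gaps and N characters are ignored; the sequences shown are hypothetical toy strings, not data from the article.

```python
def gc_percent(cds):
    """Percentage of unambiguous bases (A, C, G, T) in a sequence that are G or C."""
    seq = cds.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / acgt

# Hypothetical toy coding sequences keyed by genus; not real data.
toy = {"Lasiurus": "ATGGCCAAATAA", "Molossus": "ATGGCGGCGTGA"}
for genus, cds in toy.items():
    print(f"{genus}\t{gc_percent(cds):.1f}")
```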

The positive selection candidate Amer3 is part of a block of eight genes that was rearranged in the ancestor of bats and has maintained tight linkage to the “CE” gene block through subsequent bat evolution.

A. A table of the order (numbers) and orientation (plus or minus symbols) of eight landmark genes in outgroup tetrapods and in representative bat species. Green cells indicate the genes remain tightly linked to Amer3 in that taxon, yellow cells indicate the genes are present in the genome but not linked to Amer3, and red cells indicate the genes were not found in that species. Dark green cells indicate the orientation of the gene differs from the presumed ancestral state. The black box around Amer3 in Lasiurus cinereus denotes that the positive selection test was significant. B. A schematic of the relative positions of gene blocks on linkage groups of the genome assemblies of human and five representative bat species. Gene symbols for landmark genes based on human nomenclature define each block as shown in the legend. See text for details of how gene blocks are defined.

DOI: 10.7717/peerj.17482/supp-5

Schematic of the organization of Fgf9 and Sacs genes in bats and other tetrapods.

Each ideogram displays a linear arrangement of landmark genes identified by gene symbol. Gene orientation is indicated by a plus or minus symbol, whereas numbers indicate the gene order in human to aid the visualization of rearrangements. Genes colored purple are those that remain tightly linked to Fgf9, whereas genes colored blue remain tightly linked to Sacs. The first three genes of the CE (orange) and NF (green) blocks, as defined in the text, are shown when in proximity to the Sacs or Fgf9 genes. Genes colored white are not consistently linked to any block but are useful for identifying additional structural rearrangements in the regions. Genes shaded gray were not found in a given species, whereas gray gaps labeled with a size in megabases (Mb) indicate a large span of intervening genomic sequence. Accession numbers for each linkage group are shown above each box, with multiple boxes indicating genes that are on different linkage groups. A. Representative tetrapod gene organizations. B. Representative gene organizations in bats.

DOI: 10.7717/peerj.17482/supp-6

Organization of conserved landmark genes in the vicinity of genes selected in Lasiurus cinereus, in bats and other tetrapod groups.

Genes are organized into five color-coded multi-gene blocks, defined as described in the text and numbered according to their order in human. Arrowheads indicate relative orientation on plus or minus strands of each chromosome. Genes on the same linkage group are boxed within each species. Intergenic distances are not to scale and other genes that may be present in the region in a given species are not shown. Genes listed in the legend that are absent in any given taxon are either lost or located elsewhere in the genome (see text for details). A. Gene organization in nine bats. The twelve positive-selection candidates in L. cinereus have a darker outline. B. Organization of the same landmark genes in other tetrapods, indicating a much slower pace of structural evolution despite the greater evolutionary divergence time.

DOI: 10.7717/peerj.17482/supp-7

Properties of Tbx1-like genes.

A) Comparative secondary structure of the human TBX1 protein and the translation of an example Tbx1-like gene annotated by Cornman & Cryan (2022). Beta sheet motifs are represented by green arrows and alpha helix motifs are represented by red arrows. Unstructured amino-acid sequence at the C-terminus of each protein has been trimmed. Yellow boxes indicate residues predicted to interact with DNA targets (see text for details). The arginine residue (“R”) at position 137 is predicted to contact the major groove of the DNA helix. B) Alignments of a putative Tbx1-like homolog in Pipistrellus kuhlii to the Pipistrellus genome, as shown in the Genome Data Viewer of the National Center for Biotechnology Information. The scaffold and position of each match are shown. The bottom track is public RNA-Seq coverage data, showing RNA alignments overlapping the TBLASTN matches. C) A protein-level alignment of the coding sequence annotated in File S5 and the closest BLASTX matches in three other Vespertilionidae. Incomplete or frame-shifted codons are translated as “X”, whereas internal stop codons are translated as “*”. Residues shown in the same font color have similar biochemical properties, following the scheme used in BioEdit. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

DOI: 10.7717/peerj.17482/supp-8
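
The translation convention used in panel C (“X” for incomplete or frame-shifted codons, “*” for stop codons) can be sketched in Python as follows. The sketch assumes the standard genetic code and an in-frame, possibly gapped nucleotide alignment; rendering an all-gap codon as “-” is an added assumption, and the function name is hypothetical rather than taken from any tool used in the article.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_gapped(seq):
    """Translate an in-frame nucleotide sequence codon by codon.
    Stop codons become '*'; all-gap codons become '-' (an assumption);
    any other codon that is incomplete, gapped, or ambiguous becomes 'X'."""
    seq = seq.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        if codon == "---":
            protein.append("-")
        else:
            protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)

# Hypothetical toy input: start codon, stop codon, gapped codon, truncated codon.
print(translate_gapped("ATGTAAGC-TT"))
```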

Sequence alignments and trees used in tests of diversifying selection.

Sequence alignments are in FASTA format and guide trees are in newick format. Sequences are labeled with genus names only for consistency and software compatibility.

DOI: 10.7717/peerj.17482/supp-9
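
Because sequences and guide-tree tips are both labeled by genus, a quick consistency check between an alignment and its tree can be sketched as below. This is an illustrative check, not part of the published workflow; it assumes unquoted Newick tip labels and uses the first whitespace-delimited token of each FASTA header as the label. The example strings are hypothetical.

```python
import re

def fasta_labels(text):
    """Return the first token of each non-empty FASTA header line."""
    return [line[1:].split()[0] for line in text.splitlines()
            if line.startswith(">") and line[1:].strip()]

def newick_tips(newick):
    """Return tip labels from a Newick string with unquoted labels.
    Tip labels follow '(' or ','; internal node labels follow ')' and are skipped."""
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)

def label_mismatches(fasta_text, newick):
    """Return (labels only in the FASTA, labels only in the tree), each sorted."""
    seqs, tips = set(fasta_labels(fasta_text)), set(newick_tips(newick))
    return sorted(seqs - tips), sorted(tips - seqs)

# Hypothetical toy inputs; not the supplemental files themselves.
fasta = ">Lasiurus\nATGGCC\n>Pipistrellus\nATGGCA\n>Myotis\nATGGCG\n"
tree = "((Lasiurus:0.1,Pipistrellus:0.2):0.05,Eptesicus:0.3);"
print(label_mismatches(fasta, tree))
```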

Output and P-values for tests of diversifying selection described in the text.

Program outputs are shaded in blue. P-values less than 0.01 are bolded. The first column indicates whether the gene was found to have significant evidence of diversifying selection in previous work, as discussed in the text. A separate sheet of output is given for each tree analyzed (see text for details). The final sheet gives the tree and PAML output for the Trpc4 gene comparison between bats and carnivores.

DOI: 10.7717/peerj.17482/supp-10

Locations of genes and gene blocks discussed in the text for outgroup and bat species.

Other bat species used in evolutionary rate analysis were not included in this synteny comparison if the orthologous genes were not located on large linkage groups. Note that Mab21l1 is nested within an intron of Nbea, and thus both genes are labeled as position “1” in the column “Human order”. The strand and the order in human are listed to aid the detection of structural changes.

DOI: 10.7717/peerj.17482/supp-11

Standard BLAST text output for searches of Tbx1-like sequence XP_027987819.1 against reference genomes of Pipistrellus kuhlii and Myotis myotis.

The two outputs are concatenated here, each beginning with the line "Job Title".

DOI: 10.7717/peerj.17482/supp-12

Annotation of standard gene features associated with a Tbx1-like sequence in the Pipistrellus kuhlii reference genome sequence.

The predicted core promoter TATA box is highlighted in blue. The inferred transcription start site, based on the standard -25 offset of the TATA box, is underlined. The first transcribed start codon is highlighted in green. Two exons and a proposed intron are shown; the intron was inferred by alignment of the conceptual translation to other Tbx1-like predictions in Lasiurus cinereus and Eptesicus fuscus. A canonical polyadenylation signal is highlighted in yellow. See text for details.

DOI: 10.7717/peerj.17482/supp-13
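
The -25 offset rule can be illustrated with a short Python sketch. It assumes that the first base of the TATA box sits at position -25 relative to the transcription start site at +1 (with no position 0), and it uses the consensus pattern TATA[AT]A[AT] as a stand-in for whatever promoter prediction was actually applied; both choices are assumptions, and the test sequence is hypothetical.

```python
import re

# Assumed TATA box consensus, TATAWAW, where W is A or T.
TATA_BOX = re.compile(r"TATA[AT]A[AT]")

def infer_tss(seq, offset=25):
    """Return the 0-based index of the inferred transcription start site, or None.
    The first base of the first TATA box match is treated as position -offset,
    so position +1 lies exactly `offset` bases downstream of the match start."""
    match = TATA_BOX.search(seq.upper())
    if match is None:
        return None
    tss = match.start() + offset
    return tss if tss < len(seq) else None

# Hypothetical promoter-like sequence: TATA box at index 2, so the inferred site is index 27.
promoter = "GG" + "TATAAAA" + "C" * 18 + "A" + "TTT"
print(infer_tss(promoter))
```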

Additional Information and Declarations

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Robert S. Cornman conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Data Availability

The following information was supplied regarding data availability:

DNAZoo genome assembly of the hoary bat, available at https://www.dnazoo.org/assemblies/aeorestes_cinereus

The short read data, PRJNA559902, were used to further evaluate the genome assembly.

Funding

The author received no external funding for this work. The work was supported by internal funds of the U.S. Geological Survey. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
