title: PeerJ description: Articles published in PeerJ link: https://peerj.com/articles/index.rss3?journal=peerj&page=914 creator: info@peerj.com PeerJ errorsTo: info@peerj.com PeerJ language: en title: First complete mitogenomes of Diamesinae, Orthocladiinae, Prodiamesinae, Tanypodinae (Diptera: Chironomidae) and their implication in phylogenetics link: https://peerj.com/articles/11294 last-modified: 2021-05-06 description: Background: The mitochondrial genome (mitogenome) has been extensively used for phylogenetic and evolutionary analysis in Diptera, but mitogenome studies are still scarce in the family Chironomidae. Methods: Here, the first complete mitochondrial genomes of four chironomid species representing Diamesinae, Orthocladiinae, Prodiamesinae and Tanypodinae are presented. Coupled with published mitogenomes of two further subfamilies, a comparative mitochondrial genomic analysis of six subfamilies of Chironomidae was carried out. Results: Mitogenomes of Chironomidae are conserved in structure: each contains 37 typical genes and a control region, and all genes are arranged in the same order as in the ancestral insect mitogenome. Nucleotide composition is highly biased, and the control region displays the highest A + T content. All protein-coding genes are under purifying selection, and ATP8 evolves at the fastest rate. In addition, a phylogenetic analysis covering six subfamilies within Chironomidae was conducted. The monophyly of Chironomidae is strongly supported. However, the topology of the six subfamilies based on mitogenomes in this study is inconsistent with previous morphological and molecular studies. This may be due to the high mutation rate of the mitochondrial genetic markers within Chironomidae. Our results indicate that mitogenomes show poor signals in phylogenetic reconstructions at the subfamily level of Chironomidae.
creator: Chen-Guang Zheng creator: Xiu-Xiu Zhu creator: Li-Ping Yan creator: Yuan Yao creator: Wen-Jun Bu creator: Xin-Hua Wang creator: Xiao-Long Lin uri: https://doi.org/10.7717/peerj.11294 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2021 Zheng et al. title: Seagrass structural and elemental indicators reveal high nutrient availability within a tropical lagoon in Panama link: https://peerj.com/articles/11308 last-modified: 2021-05-06 description: Seagrass meadows are valued coastal habitats that provide ecological and economic benefits around the world. Despite their importance, many meadows are in decline, driven by a variety of anthropogenic impacts. While these declines have been well documented in some regions, other locations (particularly within the tropics) lack long-term monitoring programs needed to resolve seagrass trends over time. Effective and spatially-expansive monitoring within under-represented regions is critical to provide an accurate perspective on seagrass status and trends. We present a comprehensive dataset on seagrass coverage and composition across 24 sites in Bahía Almirante, a lagoon along the Caribbean coast of Panama. Using a single survey, we focus on capturing spatial variation in seagrass physical and elemental characteristics and provide data on key seagrass bio-indicators, such as leaf morphology (length and width), elemental content (% nitrogen and phosphorus) and stable isotopic signatures (δ13C and δ15N). We further explore relationships between these variables and water depth (proxy for light availability) and proximity to shore (proxy for terrestrial inputs). The seagrass assemblage was mostly monospecific (dominated by Thalassia testudinum) and restricted to shallow water (<3 m). Above-ground biomass varied widely, averaging 71.7 g dry mass m−2, yet ranging from 24.8 to 139.6 g dry mass m−2. 
Leaf nitrogen content averaged 2.2%, ranging from 1.76 to 2.57%, while phosphorus content averaged 0.19% and ranged from 0.15 to 0.23%. These values were high compared to other published reports for T. testudinum, indicating elevated nutrient availability within the lagoon. Seagrass stable isotopic characteristics varied slightly and were comparable with other published values. Leaf carbon signatures (δ13C) ranged from −11.74 to −6.70‰ and were positively correlated to shoreline proximity, suggesting a contribution of terrestrial carbon to seagrass biomass. Leaf nitrogen signatures (δ15N) ranged from −1.75 to 3.15‰ and showed no correlation with shoreline proximity, suggesting that N sources within the bay were not dominated by localized point-source discharge of treated sewage. Correlations between other seagrass bio-indicators and environmental metrics were mixed: seagrass cover declined with depth, while biomass was negatively correlated with N, indicating that light and nutrient availability may jointly regulate seagrass cover and biomass. Our work documents the response of seagrass in Bahía Almirante to light and nutrient availability and highlights the eutrophic status of this bay. Using the broad spatial coverage of our survey as a baseline, we suggest the future implementation of a continuous and spatially expansive seagrass monitoring program within this region to assess the health of these important systems subject to global and local stressors. creator: Julie Gaubert-Boussarie creator: Andrew H. Altieri creator: J. Emmett Duffy creator: Justin E. Campbell uri: https://doi.org/10.7717/peerj.11308 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2021 Gaubert-Boussarie et al. 
title: Robustness analysis of metabolic predictions in algal microbial communities based on different annotation pipelines link: https://peerj.com/articles/11344 last-modified: 2021-05-06 description: Animals, plants, and algae rely on symbiotic microorganisms for their development and functioning. Genome sequencing and genomic analyses of these microorganisms provide opportunities to construct metabolic networks and to analyze the metabolism of the symbiotic communities they constitute. Genome-scale metabolic network reconstructions rest on information gained from genome annotation. As there are multiple annotation pipelines available, the question arises to what extent differences in annotation pipelines impact outcomes of these analyses. Here, we compare five commonly used pipelines (Prokka, MaGe, IMG, DFAST, RAST) from predicted annotation features (coding sequences, Enzyme Commission numbers, hypothetical proteins) to the metabolic network-based analysis of symbiotic communities (biochemical reactions, producible compounds, and selection of minimal complementary bacterial communities). While Prokka and IMG produced the most extensive networks, RAST and DFAST networks produced the fewest false positives and the most connected networks with the fewest dead-end metabolites. Our results underline differences between the outputs of the tested pipelines at all examined levels, with small differences in the draft metabolic networks resulting in the selection of different microbial consortia to expand the metabolic capabilities of the algal host. However, the consortia generated yielded similar predicted producible compounds and could therefore be considered functionally interchangeable. This contrast between selected communities and community functions depending on the annotation pipeline needs to be taken into consideration when interpreting the results of metabolic complementarity analyses. 
In the future, experimental validation of bioinformatic predictions will likely be crucial to both evaluate and refine the pipelines and needs to be coupled with increased efforts to expand and improve annotations in reference databases. creator: Elham Karimi creator: Enora Geslain creator: Arnaud Belcour creator: Clémence Frioux creator: Méziane Aïte creator: Anne Siegel creator: Erwan Corre creator: Simon M. Dittami uri: https://doi.org/10.7717/peerj.11344 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2021 Karimi et al. title: Genome-wide identification, classification and expression profile analysis of the HSF gene family in Hypericum perforatum link: https://peerj.com/articles/11345 last-modified: 2021-05-06 description: Heat shock transcription factors (HSFs) are critical regulators of plant responses to various abiotic and biotic stresses, including high temperature stress. HSFs are involved in regulating the expression of heat shock proteins (HSPs) by binding to heat stress elements (HSEs) to defend against high-temperature stress. The H. perforatum genome was recently fully sequenced; this provides a valuable resource for genetic and functional analysis. In this study, 23 putative HpHSF genes were identified and divided into three groups (A, B, and C) based on phylogeny and structural features. Gene structure and conserved motif analyses were performed on HpHSF members; the DNA-binding domain (DBD), hydrophobic heptad repeat (HR-A/B), and exon-intron boundaries exhibited specific phylogenetic relationships. In addition, the presence of various cis-acting elements in the promoter regions of HpHSFs underscored their regulatory function in abiotic stress responses. RT-qPCR analyses showed that most HpHSF genes were expressed in response to heat conditions, suggesting that HpHSFs play potential roles in the heat stress resistance pathway.
Our findings will support further analysis of the function of HpHSFs in high temperature stress tolerance in H. perforatum. creator: Li Zhou creator: Xiaoding Yu creator: Donghao Wang creator: Lin Li creator: Wen Zhou creator: Qian Zhang creator: Xinrui Wang creator: Sumin Ye creator: Zhezhi Wang uri: https://doi.org/10.7717/peerj.11345 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2021 Zhou et al. title: BACPHLIP: predicting bacteriophage lifestyle from conserved protein domains link: https://peerj.com/articles/11396 last-modified: 2021-05-06 description: Bacteriophages are broadly classified into two distinct lifestyles: temperate and virulent. Temperate phages are capable of a latent phase of infection within a host cell (lysogenic cycle), whereas virulent phages directly replicate and lyse host cells upon infection (lytic cycle). Accurate lifestyle identification is critical for determining the role of individual phage species within ecosystems and their effect on host evolution. Here, we present BACPHLIP, a BACterioPHage LIfestyle Predictor. BACPHLIP detects the presence of a set of conserved protein domains within an input genome and uses this data to predict lifestyle via a Random Forest classifier that was trained on a dataset of 634 phage genomes. On an independent test set of 423 phages, BACPHLIP has an accuracy of 98%, greatly exceeding that of previously existing tools (79%). BACPHLIP is freely available on GitHub (https://github.com/adamhockenberry/bacphlip) and the code used to build and test the classifier is provided in a separate repository (https://github.com/adamhockenberry/bacphlip-model-dev) for users wishing to interrogate and re-train the underlying classification model. creator: Adam J. Hockenberry creator: Claus O.
Wilke uri: https://doi.org/10.7717/peerj.11396 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2021 Hockenberry and Wilke title: Construction of an immune-related signature with prognostic value for colon cancer link: https://peerj.com/articles/10812 last-modified: 2021-05-05 description: Background: Colon cancer is the third most common malignant tumor in the world. Although immunotherapy has been used in cancer treatment, there is still no first-line immunotherapy method for colon cancer. Therefore, it is essential to search for potential immunotherapy targets and molecular biomarkers for early diagnosis and prognosis. Methods: In this study, we downloaded transcriptome data from The Cancer Genome Atlas (TCGA) and immune-related genes from the ImmPort database. Then we filtered genes with prognostic value and constructed an immune-related signature. Patients were classified into low- and high-risk groups, and we performed a series of analyses relating the signature to clinical phenotypes. Additionally, we used protein-protein interaction networks, gene set enrichment analysis (GSEA) and single-sample gene-set enrichment analysis (ssGSEA) to explore the underlying mechanism of this signature. Furthermore, the accuracy of this signature was verified using two data sets from Gene Expression Omnibus (GEO). Results: We selected 12 immune-related genes to construct the immune-related signature. The low-risk group had a higher level of immunity than the high-risk group. The expression levels of HLA genes and checkpoint-related genes were statistically different between the low- and high-risk groups. This signature showed its prognostic value in the TCGA cohort and two GEO data sets. The signature also had a strong correlation with TNM classification, stage, survival state and lymphatic invasion.
The mechanism of the signature may be related to several transcription factors and CD8+ T cells in the tumor microenvironment. Conclusion: This immune-related signature is of great prognostic value for colon cancer, and its biofunction might be correlated with HLA genes, checkpoint-related genes and high-infiltrating T cells in tumor tissues. creator: Yunxia Lv creator: Xinyi Wang creator: Yu Ren creator: Xiaorui Fu creator: Taiyuan Li creator: Qunguang Jiang uri: https://doi.org/10.7717/peerj.10812 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2021 Lv et al. title: Comparison of ultrafiltration and iron chloride flocculation in the preparation of aquatic viromes from contrasting sample types link: https://peerj.com/articles/11111 last-modified: 2021-05-05 description: Viral metagenomes (viromes) are a valuable untargeted tool for studying viral diversity and the central roles viruses play in host disease, ecology, and evolution. Establishing effective methods to concentrate and purify viral genomes prior to sequencing is essential for high quality viromes. Using virus spike-and-recovery experiments, we stepwise compared two common approaches for virus concentration, ultrafiltration and iron chloride flocculation, across diverse matrices: wastewater influent, wastewater secondary effluent, river water, and seawater. Viral DNA was purified by removing cellular DNA via chloroform cell lysis, filtration, and enzymatic degradation of extra-viral DNA. We found that viral genomes were concentrated 1-2 orders of magnitude more with ultrafiltration than with iron chloride flocculation for all matrices, and that ultrafiltration resulted in higher quality DNA suitable for amplification-free and long-read sequencing. Given its widespread use and utility as an inexpensive field method for virome sampling, we nonetheless sought to optimize iron flocculation.
We found that viruses were best concentrated in seawater with five-fold higher iron concentrations than the standard used, that inhibition of DNase activity reduced purification effectiveness, and that five-fold more iron was needed to flocculate viruses from freshwater than from seawater. This is critical knowledge for those seeking to apply this broadly used method to freshwater virome samples. Overall, our results demonstrated that ultrafiltration and purification performed better than iron chloride flocculation and purification in the tested matrices. Given that the method performance depended on the solids content and salinity of the samples, we suggest spike-and-recovery experiments be applied when concentrating and purifying sample types that diverge from those tested here. creator: Kathryn Langenfeld creator: Kaitlyn Chin creator: Ariel Roy creator: Krista Wigginton creator: Melissa B. Duhaime uri: https://doi.org/10.7717/peerj.11111 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2021 Langenfeld et al. title: Genome-wide identification and characterization of NAC genes in Brassica juncea var. tumida link: https://peerj.com/articles/11212 last-modified: 2021-05-05 description: Background: NAC (NAM, ATAF1/2, and CUC2) transcription factors play an important role in plant growth and development. However, in tumorous stem mustard (Brassica juncea var. tumida), one of the economically important crops cultivated in southwest China and some southeast Asian countries, reports on the identification of NAC family genes are lacking. In this study, we conducted a genome-wide investigation of the NAC family genes in B. juncea var. tumida, based on its recently published genome sequence data. Methods: The NAC genes were identified in B. juncea var. tumida using a bioinformatics approach at the whole-genome level.
Additionally, the expression of BjuNAC genes was analyzed under high- and low-temperature stresses by quantitative real-time PCR (qRT-PCR). Results: A total of 300 BjuNAC genes were identified, of which 278 were mapped to specific chromosomes. Phylogenetic analysis of B. juncea var. tumida, Brassica rapa, Brassica nigra, rice and Arabidopsis thaliana NAC proteins revealed that all NAC genes were divided into 18 subgroups. Furthermore, gene structure analysis showed that most of the NAC genes contained two or three exons. Conserved motif analysis revealed that BjuNAC genes contain a conserved NAM domain. Additionally, qRT-PCR data indicated that thirteen BjuNAC genes were up-regulated to varying degrees during high-temperature stress. Under low temperature, four BjuNAC genes (BjuNAC006, BjuNAC083, BjuNAC170 and BjuNAC223) were up-regulated and two BjuNAC genes (BjuNAC074 and BjuNAC295) were down-regulated. Together, the results of this study provide a strong foundation for future investigation of the biological function of NAC genes in B. juncea var. tumida. creator: Longxing Jiang creator: Quan Sun creator: Yu Wang creator: Pingan Chang creator: Haohuan Kong creator: Changshu Luo creator: Xiaohong He uri: https://doi.org/10.7717/peerj.11212 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2021 Jiang et al.
title: Identification of P2RY13 as an immune-related prognostic biomarker in lung adenocarcinoma: A public database-based retrospective study link: https://peerj.com/articles/11319 last-modified: 2021-05-05 description: Background: Lung adenocarcinoma (LUAD) is the leading histological subtype of non-small cell lung cancer (NSCLC). Methods: In the present study, the gene matrixes of LUAD were downloaded from The Cancer Genome Atlas to infer immune and stromal scores with the ‘Estimation of Stromal and Immune cells in Malignant Tumor tissues using Expression data’ (ESTIMATE) algorithm and to identify immune-related differentially expressed genes (DEGs) between the high- and low-stromal/immune score groups. Next, all DEGs were subjected to univariate Cox regression and survival analyses to screen out prognostic biomarkers in the tumor microenvironment (TME), and were validated in the Gene Expression Omnibus database. Single-sample gene set enrichment analysis (ssGSEA) was performed to assess the level of tumor-infiltrating immune cells (TIICs) and immune functions, and GSEA was used to identify pathways altered by prognostic biomarkers. Results: Survival analysis showed that LUAD in the high-immune and stromal score group had a better clinical prognosis. A total of 303 immune-related DEGs were detected. Univariate Cox regression and survival analyses revealed that P2Y purinoceptor 13 (P2RY13) was a favorable factor for the prognosis of LUAD. ssGSEA and Spearman correlation analysis demonstrated that P2RY13 was highly correlated with various TIICs and immune functions. Several immune-associated pathways were enriched between the high- and low-expression P2RY13 groups. Conclusion: P2RY13 may be a potential prognostic indicator and is highly associated with the TME in LUAD. However, further experimental studies are required to validate the present findings.
creator: Jiang Lin creator: Chunlei Wu creator: Dehua Ma creator: Quanteng Hu uri: https://doi.org/10.7717/peerj.11319 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2021 Lin et al. title: Variance of vegetation coverage and its sensitivity to climatic factors in the Irtysh River basin link: https://peerj.com/articles/11334 last-modified: 2021-05-05 description: Background: Climate change is an important factor driving vegetation changes in arid areas. Identifying the sensitivity of vegetation to climate variability is crucial for developing sustainable ecosystem management strategies. The Irtysh River is located in the westerly partition of China, and its vegetation cover is more sensitive to climate change. However, previous studies have rarely examined the changes in the vegetation coverage of the Irtysh River and its sensitivity to climate factors from a spatiotemporal perspective. Methods: We adopted a vegetation sensitivity index based on remote sensing datasets of high temporal resolution to study the sensitivity of vegetation to climatic factors in the Irtysh River basin, and then to reveal the driving mechanism of vegetation cover change. Results: The results show that 88.09% of vegetated pixels show an increasing trend in vegetation coverage, and the sensitivity of vegetation to climate change presents spatial heterogeneity. Sensitivity of vegetation increases with the increase of coverage. Temperate steppe in the northern mountain and herbaceous swamp and broadleaf forest in the river valley, where the normalized difference vegetation index (NDVI) is the highest, show the strongest sensitivity, while the desert steppe in the northern plain, where the NDVI is the lowest, shows the strongest memory effect (or the strongest resilience). By comparison, the northern part of this area is more affected by a combination of precipitation and temperature, while the southern plains dominated by desert steppe are more sensitive to precipitation.
The central river valley dominated by herbaceous swamp is more sensitive to temperature-vegetation dryness index. This study underscores that the sensitivity of vegetation cover to climate change is spatially differentiated at the regional scale. creator: Feifei Han creator: Junjie Yan creator: Hong-bo Ling uri: https://doi.org/10.7717/peerj.11334 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2021 Han et al.