title: PeerJ description: Articles published in PeerJ link: https://peerj.com/articles/index.rss3?journal=peerj&page=1106 creator: info@peerj.com PeerJ errorsTo: info@peerj.com PeerJ language: en title: Predicting the effect of variants on splicing using Convolutional Neural Networks link: https://peerj.com/articles/9470 last-modified: 2020-07-06 description: Mutations that cause an error in the splicing of a messenger RNA (mRNA) can lead to diseases in humans. Various computational models have been developed to recognize the sequence pattern of the splice sites. In recent studies, Convolutional Neural Network (CNN) architectures were shown to outperform other existing models in predicting the splice sites. However, insufficient effort has been put into extending the CNN model to predict the effect of the genomic variants on the splicing of mRNAs. This study proposes a framework to elaborate on the utility of CNNs to assess the effect of splice variants on the identification of potential disease-causing variants that disrupt the RNA splicing process. Five models, three CNN-based and two based on non-CNN machine learning, were trained and compared using two existing splice site datasets, Genome Wide Human splice sites (GWH) and a dataset provided at the Deep Learning and Artificial Intelligence winter school 2018 (DLAI). The donor sites were also used to test on the HSplice tool to evaluate the predictive models. To improve the effectiveness of predictive models, the two datasets were combined. The CNN model with four convolutional layers showed the best splice site prediction performance with an AUPRC of 93.4% and 88.8% for donor and acceptor sites, respectively. The effects of variants on splicing were estimated by applying the best model on variant data from the ClinVar database. Based on the estimation, the framework could effectively differentiate pathogenic variants from the benign variants (p = 5.9 × 10⁻⁷).
These promising results suggest that the proposed framework could be applied in future genetic studies to identify disease-causing loci involving the splicing mechanism. The datasets and Python scripts used in this study are available on the GitHub repository at https://github.com/smiile8888/rna-splice-sites-recognition. creator: Thanyathorn Thanapattheerakul creator: Worrawat Engchuan creator: Jonathan H. Chan uri: https://doi.org/10.7717/peerj.9470 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2020 Thanapattheerakul et al. title: Evaluation of psychological stress in scientific researchers during the 2019–2020 COVID-19 outbreak in China link: https://peerj.com/articles/9497 last-modified: 2020-07-06 description: Background: Beginning in December 2019, coronavirus disease 2019 (COVID-19) caused an outbreak of infectious pneumonia. The Chinese government introduced a series of grounding measures to prevent the spread of COVID-19. The living and working patterns of many scientific researchers also underwent significant changes during this period. Methods: An opportunity sample (n = 251) was obtained in China using a questionnaire with 42 questions on scientific research progress and psychological stress during the COVID-19 epidemic. Results: Of the 251 participants, 76.9% indicated that their research was affected by the COVID-19 outbreak, and participants who were affected by the outbreak had higher stress levels than those who were not affected. Participants who conducted COVID-19 research and indicated concern that they would fail to finish the research on time were more likely to indicate high levels of stress.
Respondents indicated that extending deadlines (64.1%), receiving support from superiors for research (51.8%), and increasing benefits for researchers (51.0%) would likely relieve outbreak-related stress. Conclusion: The COVID-19 outbreak had a major impact on the experiments of researchers in the life sciences, especially in basic and clinical medicine. It has also caused high levels of psychological stress in these populations. Measures should be taken to relieve psychological pressure on basic medical researchers and students who will soon complete their degrees (e.g., Master’s and PhD candidates in graduation years). creator: Xueyan Zhang creator: Xinyu Li creator: Zhenxin Liao creator: Mingyi Zhao creator: Quan Zhuang uri: https://doi.org/10.7717/peerj.9497 license: https://creativecommons.org/licenses/by/4.0/ rights: © 2020 Zhang et al. title: Novel circulating protein biomarkers for thyroid cancer determined through data-independent acquisition mass spectrometry link: https://peerj.com/articles/9507 last-modified: 2020-07-06 description: Background: Distinguishing between different types of thyroid cancers (TC) remains challenging in clinical laboratories. As different tumor types require different clinical interventions, it is necessary to establish new methods for accurate diagnosis of TC. Methods: Proteomic analysis of the human serum was performed through data-independent acquisition mass spectrometry for 29 patients with TC (stages I–IV): 13 cases of papillary TC (PTC), 10 cases of medullary TC (MTC), and six cases of follicular TC (FTC). In addition, 15 patients with benign thyroid nodules (TNs) and 10 healthy controls (HCs) were included in this study.
Subsequently, 17 differentially expressed proteins were identified in 291 patients with TC, including 247 with PTC, 38 with MTC, and six with FTC, and 69 patients with benign TNs and 176 with HC, using enzyme-linked immunosorbent assays. Results: In total, 517 proteins were detected in the serum samples using an Orbitrap Q-Exactive-plus mass spectrometer. The amyloid beta A4 protein, apolipoprotein A-IV, gelsolin, contactin-1, gamma-glutamyl hydrolase, and complement factor H-related protein 1 (CFHR1) were selected for further analysis. The median serum CFHR1 levels were significantly higher in the MTC and FTC groups than in the PTC and control groups (P < 0.001). CFHR1 exhibited higher diagnostic performance in distinguishing patients with MTC from those with PTC (P < 0.001), with a sensitivity of 100.0%, specificity of 85.08%, area under the curve of 0.93, and detection cut-off of 0.92 ng/mL. Conclusion: CFHR1 may serve as a novel biomarker to distinguish PTC from MTC with high sensitivity and specificity. creator: Dandan Li creator: Jie Wu creator: Zhongjuan Liu creator: Ling Qiu creator: Yimin Zhang uri: https://doi.org/10.7717/peerj.9507 license: https://creativecommons.org/licenses/by/4.0/ rights: © 2020 Li et al. title: Association between the number and size of intrapulmonary lymph nodes and chronic obstructive pulmonary disease severity link: https://peerj.com/articles/9166 last-modified: 2020-07-03 description: Purpose: One of the main pathophysiological mechanisms of chronic obstructive pulmonary disease is inflammation, which has been associated with lymphadenopathy. Intrapulmonary lymph nodes can be identified on CT as perifissural nodules (PFN). We investigated the association between the number and size of PFNs and measures of COPD severity. Materials and Methods: CT images were obtained from COPDGene.
50 subjects were randomly selected per GOLD stage (0 to 4), GOLD-unclassified, and never-smoker groups and allocated to either “Healthy,” “Mild,” or “Moderate/severe” groups. 26/350 (7.4%) subjects had missing images and were excluded. Supported by computer-aided detection, a trained researcher prelocated non-calcified opacities larger than 3 mm in diameter. Included lung opacities were classified independently by two radiologists as either “PFN,” “not a PFN,” “calcified,” or “not a nodule”; disagreements were arbitrated by a third radiologist. Ordinal logistic regression was performed as the main statistical test. Results: A total of 592 opacities were included in the observer study. A total of 163/592 classifications (27.5%) required arbitration. A total of 17/592 opacities (2.9%) were excluded from the analysis because they were not considered nodular, were calcified, or all three radiologists disagreed. A total of 366/575 accepted nodules (63.7%) were considered PFNs. A maximum of 10 PFNs were found in one image; 154/324 (47.5%) contained no PFNs. The number of PFNs per subject did not differ between COPD severity groups (p = 0.50). PFN short-axis diameter could significantly distinguish between the Mild and Moderate/severe groups, but not between the Healthy and Mild groups (p = 0.021). Conclusions: There is no relationship between PFN count and COPD severity. There may be a weak trend of larger intrapulmonary lymph nodes among patients with more advanced stages of COPD. creator: Anton Schreuder creator: Colin Jacobs creator: Ernst T. Scholten creator: Mathias Prokop creator: Bram van Ginneken creator: David A. Lynch creator: Cornelia M. Schaefer-Prokop uri: https://doi.org/10.7717/peerj.9166 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2020 Schreuder et al. title: Genome-wide analysis of basic helix-loop-helix transcription factors in papaya (Carica papaya L.)
link: https://peerj.com/articles/9319 last-modified: 2020-07-03 description: The basic helix-loop-helix (bHLH) transcription factors (TFs) have been identified and functionally characterized in many plants. However, no comprehensive analysis of the bHLH family in papaya (Carica papaya L.) has been reported previously. Here, a total of 73 CpbHLHs were identified in papaya, and these genes were classified into 18 subfamilies based on phylogenetic analysis. Almost all of the CpbHLHs in the same subfamily shared similar gene structures and protein motifs according to analysis of exon/intron organizations and motif compositions. The number of exons in CpbHLHs varied from one to 10 with an average of five. The amino acid sequences of the bHLH domains were highly conserved, especially Leu-27 and Leu-63. Promoter cis-element analysis revealed that most of the CpbHLHs contained cis-elements that can respond to various biotic/abiotic stress-related events. Gene ontology (GO) analysis revealed that CpbHLHs mainly function in protein dimerization activity and DNA-binding, and most CpbHLHs were predicted to localize in the nucleus. Abiotic stress treatment and quantitative real-time PCR (qRT-PCR) revealed some important candidate CpbHLHs that might be responsible for abiotic stress responses in papaya. These findings lay a foundation for further investigation of the molecular functions of CpbHLHs. creator: Min Yang creator: Chenping Zhou creator: Hu Yang creator: Ruibin Kuang creator: Bingxiong Huang creator: Yuerong Wei uri: https://doi.org/10.7717/peerj.9319 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2020 Yang et al.
title: Addressing incomplete lineage sorting and paralogy in the inference of uncertain salmonid phylogenetic relationships link: https://peerj.com/articles/9389 last-modified: 2020-07-03 description: Recent and continued progress in the scale and sophistication of phylogenetic research has yielded substantial advances in knowledge of the tree of life; however, segments of that tree remain unresolved and continue to produce contradicting or unstable results. These poorly resolved relationships may be the product of methodological shortcomings or of an evolutionary history that did not generate the signal traits needed for its eventual reconstruction. Relationships within the euteleost fish family Salmonidae have proven challenging to resolve in molecular phylogenetics studies in part due to ancestral autopolyploidy contributing to conflicting gene trees. We examine a sequence capture dataset from salmonids and use alternative strategies to accommodate the effects of gene tree conflict based on aspects of salmonid genome history and the multispecies coalescent. We investigate in detail three uncertain relationships: (1) subfamily branching, (2) monophyly of Coregonus and (3) placement of Parahucho. Coregoninae and Thymallinae are resolved as sister taxa, although conflicting topologies are found across analytical strategies. We find inconsistent and generally low support for the monophyly of Coregonus, including in results of analyses with the most extensive dataset and complex model. The most consistent placement of Parahucho is as sister lineage of Salmo. creator: Matthew A. Campbell creator: Thaddaeus J. Buser creator: Michael E. Alfaro creator: J. Andrés López uri: https://doi.org/10.7717/peerj.9389 license: https://creativecommons.org/licenses/by/4.0/ rights: © 2020 Campbell et al. 
title: A little frog leaps a long way: compounded colonizations of the Indian Subcontinent discovered in the tiny Oriental frog genus Microhyla (Amphibia: Microhylidae) link: https://peerj.com/articles/9411 last-modified: 2020-07-03 description: Frogs of the genus Microhyla include some of the world’s smallest amphibians and represent the largest radiation of Asian microhylids, currently encompassing 50 species, distributed across the Oriental biogeographic region. The genus Microhyla remains one of the taxonomically most challenging groups of Asian frogs and was found to be paraphyletic with respect to large-sized fossorial Glyphoglossus. In this study we present a time-calibrated phylogeny for frogs in the genus Microhyla, and discuss taxonomy, historical biogeography, and morphological evolution of these frogs. Our updated phylogeny of the genus with nearly complete taxon sampling includes 48 nominal Microhyla species and several undescribed candidate species. Phylogenetic analyses of 3,207 bp of combined mtDNA and nuDNA data recovered three well-supported groups: the Glyphoglossus clade, Southeast Asian Microhyla II clade (includes M. annectens species group), and a diverse Microhyla I clade including all other species. Within the largest major clade of Microhyla are seven well-supported subclades that we identify as the M. achatina, M. fissipes, M. berdmorei, M. superciliaris, M. ornata, M. butleri, and M. palmipes species groups. The phylogenetic position of 12 poorly known Microhyla species is clarified for the first time. These phylogenetic results, along with molecular clock and ancestral area analyses, show the Microhyla—Glyphoglossus assemblage to have originated in Southeast Asia in the middle Eocene just after the first hypothesized land connections between the Indian Plate and the Asian mainland. 
While Glyphoglossus and Microhyla II remained within their ancestral ranges, Microhyla I expanded its distribution generally east to west, colonizing and diversifying through the Cenozoic. The Indian Subcontinent was colonized by members of five Microhyla species groups independently, starting with the end Oligocene—early Miocene that coincides with an onset of seasonally dry climates in South Asia. Body size evolution modeling suggests that four groups of Microhyla have independently achieved extreme miniaturization with adult body size below 15 mm. Three of the five smallest Microhyla species are obligate phytotelm-breeders and we argue that their peculiar reproductive biology may be a factor involved in miniaturization. Body size increases in Microhyla—Glyphoglossus seem to be associated with a burrowing adaptation to seasonally dry habitats. Species delimitation analyses suggest a vast underestimation of species richness and diversity in Microhyla and reveal 15–33 undescribed species. We revalidate M. nepenthicola, synonymize M. pulverata with M. marmorata, and provide insights on taxonomic statuses of a number of poorly known species. Further integrative studies, combining evidence from phylogeny, morphology, advertisement calls, and behavior will result in a better systematic understanding of this morphologically cryptic radiation of Asian frogs. creator: Vladislav A. Gorin creator: Evgeniya N. Solovyeva creator: Mahmudul Hasan creator: Hisanori Okamiya creator: D.M.S. Suranjan Karunarathna creator: Parinya Pawangkhanant creator: Anslem de Silva creator: Watinee Juthong creator: Konstantin D. Milto creator: Luan Thanh Nguyen creator: Chatmongkon Suwannapoom creator: Alexander Haas creator: David P. Bickford creator: Indraneil Das creator: Nikolay A. Poyarkov uri: https://doi.org/10.7717/peerj.9411 license: https://creativecommons.org/licenses/by/4.0/ rights: © 2020 Gorin et al. 
title: The impact of short-term exposure to near shore stressors on the early life stages of the reef building coral Montipora capitata link: https://peerj.com/articles/9415 last-modified: 2020-07-03 description: Successful reproduction and survival are crucial to the continuation and resilience of corals globally. As reef waters warm due to climate change, episodic large-scale tropical storms are becoming more frequent, drastically altering the near shore water quality for short periods of time. Therefore, it is critical that we understand the effects warming waters, fresh water input, and run-off have on sexual reproduction of coral. To better understand the effects of these near shore stressors on Hawaiian coral, laboratory experiments were conducted at the Institute of Marine Biology to determine the independent effects of suspended sediment concentrations (100 mg l⁻¹ and 200 mg l⁻¹), lowered salinity (28‰), and elevated temperature (31 °C) on the successful fertilization, larval survival, and settlement of the scleractinian coral Montipora capitata. In the present study, early developmental stages of coral were exposed to one of three near shore stressors for a period of 24 h and the immediate (fertilization) and latent effects (larval survival and settlement) were observed and measured. Fertilization success and settlement were not affected by any of the treatments; however, larval survival was negatively affected by all of the treatments by 50% or greater (p > 0.05). These data show that early life stages of M. capitata may be impacted by near shore stressors associated with warming and more frequent storm events. creator: Claire V.A. Lager creator: Mary Hagedorn creator: Kuʻulei S. Rodgers creator: Paul L. Jokiel uri: https://doi.org/10.7717/peerj.9415 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2020 Lager et al.
title: A comparative analysis of the complete chloroplast genomes of three Chrysanthemum boreale strains link: https://peerj.com/articles/9448 last-modified: 2020-07-03 description: Background: Chrysanthemum boreale Makino (Anthemideae, Asteraceae) is a plant of economic, ornamental and medicinal importance. We characterized and compared the chloroplast genomes of three C. boreale strains. These were collected from different geographic regions of Korea and varied in floral morphology. Methods: The chloroplast genomes were obtained by next-generation sequencing techniques, assembled de novo, annotated, and compared with one another. Phylogenetic analysis placed them within the Anthemideae tribe. Results: The sizes of the complete chloroplast genomes of the C. boreale strains were 151,012 bp (strain 121002), 151,098 bp (strain IT232531) and 151,010 bp (strain IT301358). Each genome contained 80 unique protein-coding genes, 4 rRNA genes and 29 tRNA genes. Comparative analyses revealed a high degree of conservation in the overall sequence, gene content, gene order and GC content among the strains. We identified 298 single nucleotide polymorphisms (SNPs) and 106 insertions/deletions (indels) in the chloroplast genomes. These variations were more abundant in non-coding regions than in coding regions. Long dispersed repeats and simple sequence repeats were present in both coding and noncoding regions, with greater frequency in the latter. Regardless of their location, these repeats can be used for molecular marker development. Phylogenetic analysis revealed the evolutionary relationship of the species in the Anthemideae tribe. The three complete chloroplast genomes will be valuable genetic resources for studying the population genetics and evolutionary relationships of Asteraceae species.
creator: Swati Tyagi creator: Jae-A Jung creator: Jung Sun Kim creator: So Youn Won uri: https://doi.org/10.7717/peerj.9448 license: https://creativecommons.org/licenses/by/4.0/ rights: © 2020 Tyagi et al. title: Treating coral bleaching as weather: a framework to validate and optimize prediction skill link: https://peerj.com/articles/9449 last-modified: 2020-07-03 description: Few coral reefs remain unscathed by mass bleaching over the past several decades, and much of the coral reef science conducted today relates in some way to the causes, consequences, or recovery pathways of bleaching events. Most studies portray a simple cause and effect relationship between anomalously high summer temperatures and bleaching, which is understandable given that bleaching rarely occurs outside these unusually warm times. However, the statistical skill with which temperature captures bleaching is hampered by many “false alarms”, times when temperatures reached nominal bleaching levels, but bleaching did not occur. While these false alarms are often not included in global bleaching assessments, they offer valuable opportunities to improve predictive skill, and therefore understanding, of coral bleaching events. Here, I show how a statistical framework adopted from weather forecasting can optimize bleaching predictions and validate which environmental factors play a role in bleaching susceptibility. Removing the 1 °C above the maximum monthly mean cutoff in the typical degree heating weeks (DHW) definition, adjusting the DHW window from 12 to 9 weeks, using regional-specific DHW thresholds, and including an El Niño threshold already improves the model skill by 45%. Most importantly, this framework enables hypothesis testing of other factors or metrics that may improve our ability to forecast coral bleaching events. creator: Thomas M. DeCarlo uri: https://doi.org/10.7717/peerj.9449 license: https://creativecommons.org/licenses/by/4.0/ rights: ©2020 DeCarlo
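The degree heating weeks (DHW) adjustments named in the last description above (dropping the 1 °C cutoff and shortening the window from 12 to 9 weeks) can be illustrated with a minimal Python sketch. It assumes one sea surface temperature value per week and a single maximum monthly mean (MMM) per site, which is a simplification; the function name, parameter names, and sample temperatures are hypothetical, and the regional thresholds and El Niño threshold mentioned in the abstract are not modeled. It is not the author's code.

```python
from typing import List, Sequence


def degree_heating_weeks(
    weekly_sst: Sequence[float],
    mmm: float,
    window_weeks: int = 12,
    hotspot_cutoff: float = 1.0,
) -> List[float]:
    """Return a DHW value (in degC-weeks) for every week of the series.

    Assumption: one SST value per week. A "hotspot" is the positive excess
    of SST over the maximum monthly mean (MMM). In the commonly used
    definition only hotspots of at least 1 degC are accumulated over a
    12-week trailing window; both settings are parameters here so the
    variants described in the abstract can be compared.
    """
    dhw = []
    for i in range(len(weekly_sst)):
        start = max(0, i - window_weeks + 1)  # trailing window, inclusive of week i
        total = 0.0
        for sst in weekly_sst[start : i + 1]:
            hotspot = sst - mmm
            # Accumulate only positive anomalies that reach the cutoff.
            if hotspot > 0 and hotspot >= hotspot_cutoff:
                total += hotspot
        dhw.append(total)
    return dhw


# Hypothetical weekly SST series (degC) for a site with MMM = 29.0 degC.
sst = [29.0, 29.5, 30.25, 30.5, 29.75]
standard = degree_heating_weeks(sst, mmm=29.0)  # 12 weeks, 1 degC cutoff
adjusted = degree_heating_weeks(sst, mmm=29.0, window_weeks=9, hotspot_cutoff=0.0)
print(standard)
print(adjusted)
```

Without the cutoff, the weeks that are warm but less than 1 °C above the MMM also contribute, so the adjusted series accumulates heat stress earlier than the standard one for the same temperatures.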