title: PeerJ description: Articles published in PeerJ link: https://peerj.com/articles/index.rss3?journal=peerj&page=1740 creator: info@peerj.com PeerJ errorsTo: info@peerj.com PeerJ language: en title: Compacting and correcting Trinity and Oases RNA-Seq de novo assemblies link: https://peerj.com/articles/2988 last-modified: 2017-02-16 description: Background. De novo transcriptome assembly of short reads is now a common step in expression analysis of organisms lacking a reference genome sequence. Several software packages are available to perform this task. Even if their results are of good quality, it is still possible to improve them in several ways, including redundancy reduction or error correction. Trinity and Oases are two commonly used de novo transcriptome assemblers. The contig sets they produce are of good quality. Still, their compaction (number of contigs needed to represent the transcriptome) and their quality (chimera and nucleotide error rates) can be improved. Results. We built a de novo RNA-Seq Assembly Pipeline (DRAP) which wraps these two assemblers (Trinity and Oases) in order to improve their results regarding the above-mentioned criteria. DRAP reduces the number of resulting contigs of the assemblies 1.3- to 15-fold, depending on the read set and the assembler used. This article presents seven assembly comparisons showing in some cases drastic improvements when using DRAP. DRAP does not significantly impair assembly quality metrics such as read realignment rate or protein reconstruction counts. Conclusion. Transcriptome assembly is a challenging computational task. Even if good solutions are already available to end-users, these solutions can still be improved while conserving the overall representation and quality of the assembly. The de novo RNA-Seq Assembly Pipeline (DRAP) is an easy-to-use software package to produce a compact and corrected transcript set. DRAP is free, open-source and available under GPL V3 license at http://www.sigenae.org/drap.
creator: Cédric Cabau creator: Frédéric Escudié creator: Anis Djari creator: Yann Guiguen creator: Julien Bobe creator: Christophe Klopp uri: https://doi.org/10.7717/peerj.2988 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2017 Cabau et al. title: Variant profiling of evolving prokaryotic populations link: https://peerj.com/articles/2997 last-modified: 2017-02-16 description: Genomic heterogeneity of bacterial species is observed and studied in experimental evolution experiments and clinical diagnostics, and occurs as micro-diversity of natural habitats. The challenge for genome research is to accurately capture this heterogeneity with the currently used short sequencing reads. Recent advances in NGS technologies improved the speed and coverage and thus allowed for deep sequencing of bacterial populations. This facilitates the quantitative assessment of genomic heterogeneity, including low frequency alleles or haplotypes. However, false positive variant predictions due to sequencing errors and mapping artifacts of short reads need to be prevented. We therefore created VarCap, a workflow for the reliable prediction of different types of variants even at low frequencies. In order to predict SNPs, InDels and structural variations, we evaluated the sensitivity and accuracy of different software tools using synthetic read data. The results suggested that the best sensitivity could be reached by a union of different tools, but at the price of increased false positives. We identified possible reasons for false predictions and used this knowledge to improve the accuracy by post-filtering the predicted variants according to properties such as frequency, coverage, genomic environment/localization and co-localization with other variants. We observed that the best precision was achieved by using an intersection of at least two tools per variant. This resulted in the reliable prediction of variants above a minimum relative abundance of 2%.
VarCap is designed for routine use in experimental evolution experiments or clinical diagnostics. The detected variants are reported as frequencies within a VCF file and as a graphical overview of the distribution of the different variant/allele/haplotype frequencies. The source code of VarCap is available at https://github.com/ma2o/VarCap. In order to provide this workflow to a broad community, we implemented VarCap on a Galaxy webserver, which is accessible at http://galaxy.csb.univie.ac.at. creator: Markus Zojer creator: Lisa N. Schuster creator: Frederik Schulz creator: Alexander Pfundner creator: Matthias Horn creator: Thomas Rattei uri: https://doi.org/10.7717/peerj.2997 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2017 Zojer et al. title: Understanding morphological variability in a taxonomic context in Chilean diplomystids (Teleostei: Siluriformes), including the description of a new species link: https://peerj.com/articles/2991 last-modified: 2017-02-16 description: Following study of the external morphology and its unmatched variability throughout ontogeny and a re-examination of selected morphological characters based on many specimens of diplomystids from Central and South Chile, we revised and emended previous specific diagnoses and consider Diplomystes chilensis, D. nahuelbutaensis, D. camposensis, and Olivaichthys viedmensis (Baker River) to be valid species. Another group, previously identified as Diplomystes sp., D. spec., D. aff. chilensis, and D. cf. chilensis inhabiting rivers between Rapel and Itata Basins is given a new specific name (Diplomystes incognitus) and is diagnosed. An identification key to the Chilean species, including the new species, is presented.
All specific diagnoses are based on external morphological characters, such as aspects of the skin, neuromast lines, and main lateral line, and position of the anus and urogenital pore, as well as certain osteological characters to facilitate the identification of these species that previously was based on many internal characters. Diplomystids below 150 mm standard length (SL) share a similar external morphology and body proportions that make identification difficult; however, specimens over 150 mm SL can be diagnosed by the position of the urogenital pore and anus, and a combination of external and internal morphological characters. According to current knowledge, diplomystid species have an allopatric distribution with each species apparently endemic to particular basins in continental Chile and one species (O. viedmensis) known only from one river in the Chilean Patagonia, but distributed extensively in southern Argentina. creator: Gloria Arratia creator: Claudio Quezada-Romegialli uri: https://doi.org/10.7717/peerj.2991 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2017 Arratia and Quezada-Romegialli title: High richness of insect herbivory from the early Miocene Hindon Maar crater, Otago, New Zealand link: https://peerj.com/articles/2985 last-modified: 2017-02-16 description: Plants and insects are key components of terrestrial ecosystems and insect herbivory is the most important type of interaction in these ecosystems. This study presents the first analysis of associations between plants and insects for the early Miocene Hindon Maar fossil lagerstätte, Otago, New Zealand. A total of 584 fossil angiosperm leaves representing 24 morphotypes were examined to determine the presence or absence of insect damage types. Of these leaves, 73% show signs of insect damage; they comprise 821 occurrences of damage from 87 damage types representing all eight functional feeding groups. 
In comparison to other fossil localities, the Hindon leaves display a high abundance of insect damage and a high diversity of damage types. Leaves of Nothofagus (southern beech), the dominant angiosperm in the fossil assemblage, exhibit a similar leaf damage pattern to leaves from the nearby mid to late Miocene Dunedin Volcano Group sites but display a more diverse spectrum and much higher percentage of herbivory damage than a comparable dataset of leaves from Palaeocene and Eocene sites in the Antarctic Peninsula. creator: Anna Lena Möller creator: Uwe Kaulfuss creator: Daphne E. Lee creator: Torsten Wappler uri: https://doi.org/10.7717/peerj.2985 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2017 Möller et al. title: Exoskeletons of Bougainvilliidae and other Hydroidolina (Cnidaria, Hydrozoa): structure and composition link: https://peerj.com/articles/2964 last-modified: 2017-02-16 description: The exoskeleton is an important source of characters for the taxonomy of Hydroidolina. It originates as epidermal secretions and, among other functions, protects the coenosarc of the polypoid stage. However, comparative studies on the exoskeletal tissue origin, development, chemical, and structural characteristics, as well as its evolution and homology, are few and fragmented. This study compares the structure and composition of the exoskeleton and underlying coenosarc in members of “Anthoathecata” and some Leptothecata, mainly through histological analyses of bougainvilliid polyps. We also studied the development of the exoskeleton under experimental conditions. We identified three types of glandular epidermal cells related to the origin of the exoskeleton and the secretion of its polysaccharide component. The exoskeleton of the species studied is either bilayered (perisarc and exosarc, especially in bougainvilliids) or corneous (perisarc).
The exoskeleton varies in chemical composition, structural rigidity, thickness, extension, and coverage in the different regions of the colony. In bilayered exoskeletons, the exosarc is produced first and appears to be a key step in the formation of the rigid exoskeleton. The exoskeleton contains anchoring structures such as desmocytes and “perisarc extensions.” creator: María A. Mendoza-Becerril creator: José Eduardo A.R. Marian creator: Alvaro Esteves Migotto creator: Antonio Carlos Marques uri: https://doi.org/10.7717/peerj.2964 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2017 Mendoza-Becerril et al. title: Small-scale spatial variation in population- and individual-level reproductive parameters of the blue-legged hermit crab Clibanarius tricolor link: https://peerj.com/articles/3004 last-modified: 2017-02-15 description: Management of the few regulated ornamental fisheries relies on inadequate information about the life history of the target species. Herein, we investigated the reproductive biology of the most heavily traded marine invertebrate in the western Atlantic, the blue-legged hermit crab Clibanarius tricolor. We report on density, individual-level, and population-level reproductive parameters in 14 populations spanning the Florida Keys. In C. tricolor, abundance, population-level, and individual-level reproductive parameters exhibited substantial small-scale spatial variation in the Florida Keys. For instance, the proportion of brooding females varied from 10% to 94% across localities. In females, average (±SD) fecundity varied between 184 (±54) and 614 (±301) embryos per crab across populations. Fecundity usually increases with female body size in hermit crabs. However, we found no effect of female body size on fecundity in three of the populations. Altogether, our observations suggest that C.
tricolor may fit a source-sink metapopulation dynamic in the Florida Keys with low reproductive intensity and absence of a parental body size–fecundity relationship resulting in net reproductive losses at some localities. We argue in favor of additional studies describing population dynamics and other aspects of the natural history of C. tricolor (e.g., development type, larval duration) to reveal ‘source’ populations, capable of exporting larvae to nearby populations. Our observations imply that future studies aimed at assessing standing stocks or describing other aspects of the life history of this hermit crab need to focus on multiple localities simultaneously. This and future studies on the reproductive biology of this species will form the baseline for models aimed at assessing the stock condition and sustainability of this heavily harvested crustacean. creator: J. Antonio Baeza creator: Donald C. Behringer uri: https://doi.org/10.7717/peerj.3004 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2017 Baeza and Behringer title: Risk analysis of colorectal cancer incidence by gene expression analysis link: https://peerj.com/articles/3003 last-modified: 2017-02-15 description: Background. Colorectal cancer (CRC) is one of the leading cancers worldwide. Several studies have performed microarray data analyses for cancer classification and prognostic analyses. Microarray assays also enable the identification of gene signatures for molecular characterization and treatment prediction. Objective. Microarray gene expression data from the online Gene Expression Omnibus (GEO) database were used to distinguish colorectal cancer from normal colon tissue samples. Methods. We collected microarray data from the GEO database to establish colorectal cancer microarray gene expression datasets for a combined analysis.
Using the Prediction Analysis for Microarrays (PAM) method and the GSEA MSigDB resource, we analyzed the 14,698 genes that were identified through an examination of their expression values between normal and tumor tissues. Results. Ten genes (ABCG2, AQP8, SPIB, CA7, CLDN8, SCNN1B, SLC30A10, CD177, PADI2, and TGFBI) were found to be good indicators of the candidate genes that correlate with CRC. From these selected genes, an average of six significant genes were obtained using the PAM method, with an accuracy rate of 95%. The results demonstrate the potential of utilizing a model with the PAM method for data mining. After a detailed review of the published reports, the results confirmed that the screened candidate genes are good indicators for cancer risk analysis using the PAM method. Conclusions. Six genes were selected with 95% accuracy to effectively classify normal and colorectal cancer tissues. We hope that these results will provide the basis for new research projects in clinical practice that aim to rapidly assess colorectal cancer risk using microarray gene expression analysis. creator: Wei-Chuan Shangkuan creator: Hung-Che Lin creator: Yu-Tien Chang creator: Chen-En Jian creator: Hueng-Chuen Fan creator: Kang-Hua Chen creator: Ya-Fang Liu creator: Huan-Ming Hsu creator: Hsiu-Ling Chou creator: Chung-Tay Yao creator: Chi-Ming Chu creator: Sui-Lung Su creator: Chi-Wen Chang uri: https://doi.org/10.7717/peerj.3003 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2017 Shangkuan et al. title: Effect of interleukin (IL)-35 on IL-17 expression and production by human CD4+ T cells link: https://peerj.com/articles/2999 last-modified: 2017-02-15 description: Background. Interleukin (IL)-17 produced mainly by T helper 17 (Th17) cells may play an important destructive role in chronic periodontitis (CP). Thus, anti-inflammatory cytokines, such as IL-35, might have a beneficial effect in periodontitis by inhibiting differentiation of Th17 cells.
Th17 differentiation is regulated by the retinoic acid receptor-related orphan receptor (ROR) α (encoded by RORA) and RORγt (encoded by RORC). However, the role of IL-35 in periodontitis is not clear and the effect of IL-35 on the function of Th17 cells is still incompletely understood. Therefore, we investigated the effects of IL-35 on Th17 cells. Methods. Peripheral blood mononuclear cells (PBMCs) were sampled from three healthy volunteers and three CP patients and were analyzed by flow cytometry for T cell population. Th17 cells differentiated by a cytokine cocktail (recombinant transforming growth factor-β, rIL-6, rIL-1β, anti-interferon (IFN)-γ, anti-IL-2 and anti-IL-4) from PBMCs were cultured with or without rIL-35. IL17A (which usually refers to IL-17), RORA and RORC mRNA expression was analyzed by quantitative polymerase chain reaction, and IL-17A production was determined by enzyme-linked immunosorbent assay. Results. The proportion of IL-17A+CD4+ T cells was slightly increased in CP patients compared with healthy controls; however, there were no significant differences in the percentage of IL-17A+CD4+ as well as IFN-γ+CD4+ and Foxp3+CD4+ T cells between healthy controls and CP patients. IL17A, RORA and RORC mRNA expression was significantly increased in Th17 cells induced by the cytokine cocktail, and the induction was significantly inhibited by addition of rIL-35 (1 ng/mL). IL-17A production in Th17 cells was significantly inhibited by rIL-35 addition (1 ng/mL). Discussion. The present study suggests that IL-35 could directly suppress IL-17 expression via RORα and RORγt inhibition and might play an important role in inflammatory diseases such as periodontitis.
creator: Kosuke Okada creator: Takeki Fujimura creator: Takeshi Kikuchi creator: Makoto Aino creator: Yosuke Kamiya creator: Ario Izawa creator: Yuki Iwamura creator: Hisashi Goto creator: Iichiro Okabe creator: Eriko Miyake creator: Yoshiaki Hasegawa creator: Makio Mogi creator: Akio Mitani uri: https://doi.org/10.7717/peerj.2999 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2017 Okada et al. title: ToxGen: an improved reference database for the identification of type B-trichothecene genotypes in Fusarium link: https://peerj.com/articles/2992 last-modified: 2017-02-15 description: Type B trichothecenes, which pose a serious hazard to consumer health, occur worldwide in grains. These mycotoxins are produced mainly by three different trichothecene genotypes/chemotypes: 3ADON (3-acetyldeoxynivalenol), 15ADON (15-acetyldeoxynivalenol) and NIV (nivalenol), named after these three major mycotoxin compounds. Correct identification of these genotypes is essential for all studies relating to population surveys, fungal ecology and mycotoxicology. Trichothecene producers exhibit enormous strain-dependent chemical diversity, which may result in variation in levels of the genotype’s determining toxin and in the production of low to high amounts of atypical compounds. New high-throughput DNA-sequencing technologies promise to boost the diagnostics of mycotoxin genotypes. However, this requires a reference database containing a satisfactory taxonomic sampling of sequences showing high correlation to actually produced chemotypes. We believe that one of the most pressing current challenges of such a database is the linking of molecular identification with chemical diversity of the strains, as well as other metadata. In this study, we use the Tri12 gene involved in mycotoxin biosynthesis for identification of Tri genotypes through sequence comparison.
Tri12 sequences from a range of geographically diverse fungal strains comprising 22 Fusarium species were stored in the ToxGen database, which covers descriptive and up-to-date annotations such as indication of Tri genotype and chemotype of the strains, chemical diversity, information on trichothecene-inducing host, substrate or media, geographical locality, and most recent taxonomic affiliations. The present initiative bridges the gap between the demands of comprehensive studies on trichothecene producers and the existing nucleotide sequence databases, which lack toxicological and other auxiliary data. We invite researchers working in the fields of fungal taxonomy, epidemiology and mycotoxicology to join the freely available annotation effort. creator: Tomasz Kulik creator: Kessy Abarenkov creator: Maciej Buśko creator: Katarzyna Bilska creator: Anne D. van Diepeningen creator: Anna Ostrowska-Kołodziejczak creator: Katarzyna Krawczyk creator: Balázs Brankovics creator: Sebastian Stenglein creator: Jakub Sawicki creator: Juliusz Perkowski uri: https://doi.org/10.7717/peerj.2992 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2017 Kulik et al. title: Opportunities drive the global distribution of protected areas link: https://peerj.com/articles/2989 last-modified: 2017-02-15 description: Background. Protected areas, regarded today as a cornerstone of nature conservation, result from an array of multiple motivations and opportunities.
We explored at global and regional levels the current distribution of protected areas along biophysical, human, and biological gradients, and assessed to what extent protection has pursued (i) a balanced representation of biophysical environments, (ii) a set of preferred conditions (biological, spiritual, economic, or geopolitical), or (iii) existing opportunities for conservation regardless of any representation or preference criteria. Methods. We used histograms to describe the distribution of terrestrial protected areas along biophysical, human, and biological independent gradients and linear and non-linear regression and correlation analyses to describe the sign, shape, and strength of the relationships. We used a random forest analysis to rank the importance of different variables related to conservation preferences and opportunity drivers, and an evenness metric to quantify representativeness. Results. We find that protection at a global level is primarily driven by the opportunities provided by isolation and a low population density (variable importance = 34.6 and 19.9, respectively). Preferences play a secondary role, with a bias towards tourism attractiveness and proximity to international borders (variable importance = 12.7 and 3.4, respectively). Opportunities shape protection strongly in “North America & Australia–NZ” and “Latin America & Caribbean,” while the importance of the representativeness of biophysical environments is higher in “Sub-Saharan Africa” (1.3 times the average of other regions). Discussion. Environmental representativeness and biodiversity protection are top priorities in land conservation agendas. However, our results suggest that they have been minor players driving current protection at both global and regional levels. Attempts to increase their relevance will necessarily have to recognize the predominant opportunistic nature that the establishment of protected areas has had until present times.
creator: Germán Baldi creator: Marcos Texeira creator: Osvaldo A. Martin creator: H. Ricardo Grau creator: Esteban G. Jobbágy uri: https://doi.org/10.7717/peerj.2989 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2017 Baldi et al.