title: PeerJ description: Articles published in PeerJ link: https://peerj.com/articles/index.rss3?journal=peerj&page=1572 creator: info@peerj.com PeerJ errorsTo: info@peerj.com PeerJ language: en title: EnzyNet: enzyme classification using 3D convolutional neural networks on spatial representation link: https://peerj.com/articles/4750 last-modified: 2018-05-04 description: During the past decade, with the significant progress of computational power as well as ever-rising data availability, deep learning techniques became increasingly popular due to their excellent performance on computer vision problems. The size of the Protein Data Bank (PDB) has increased more than 15-fold since 1999, which enabled the expansion of models that aim at predicting enzymatic function via amino acid composition. Amino acid sequence, however, is less conserved in nature than protein structure and is therefore considered a less reliable predictor of protein function. This paper presents EnzyNet, a novel 3D convolutional neural network classifier that predicts the Enzyme Commission number of enzymes based only on their voxel-based spatial structure. The spatial distribution of biochemical properties was also examined as complementary information. The two-layer architecture was investigated on a large dataset of 63,558 enzymes from the PDB and achieved an accuracy of 78.4% by exploiting only the binary representation of the protein shape. Code and datasets are available at https://github.com/shervinea/enzynet. creator: Afshine Amidi creator: Shervine Amidi creator: Dimitrios Vlachakis creator: Vasileios Megalooikonomou creator: Nikos Paragios creator: Evangelia I. Zacharaki uri: https://doi.org/10.7717/peerj.4750 license: http://creativecommons.org/licenses/by/4.0/ rights: © 2018 Amidi et al. 
title: Temporally-aware algorithms for the classification of anuran sounds link: https://peerj.com/articles/4732 last-modified: 2018-05-04 description: Several authors have shown that the sounds of anurans can be used as an indicator of climate change. Hence, the recording, storage and further processing of a huge number of anuran sounds, distributed over time and space, are required in order to obtain this indicator. Furthermore, it is desirable to have algorithms and tools for the automatic classification of the different classes of sounds. In this paper, six classification methods are proposed, all based on the data-mining domain, which strive to take advantage of the temporal character of the sounds. The definition and comparison of these classification methods is undertaken using several approaches. The main conclusions of this paper are that: (i) the sliding window method attained the best results in the experiments presented, and even outperformed the hidden Markov models usually employed in similar applications; (ii) noteworthy overall classification performance has been obtained, which is an especially striking result considering that the sounds analysed were affected by a highly noisy background; (iii) the instance selection for the determination of the sounds in the training dataset offers better results than cross-validation techniques; and (iv) the temporally-aware classifiers have revealed that they can obtain better performance than their non-temporally-aware counterparts. creator: Amalia Luque creator: Javier Romero-Lemos creator: Alejandro Carrasco creator: Luis Gonzalez-Abril uri: https://doi.org/10.7717/peerj.4732 license: http://creativecommons.org/licenses/by/4.0/ rights: © 2018 Luque et al. title: The pitfalls of short-range endemism: high vulnerability to ecological and landscape traps link: https://peerj.com/articles/4715 last-modified: 2018-05-04 description: Ecological traps attract biota to low-quality habitats. 
Landscape traps are zones caught in a vortex of spiralling degradation. Here, we demonstrate how short-range endemic (SRE) traits may make such taxa vulnerable to ecological and landscape traps. Three SRE species of mygalomorph spider were used in this study: Idiommata blackwalli, Idiosoma sigillatum and an undescribed Aganippe sp. Mygalomorphs can be long-lived (>43 years) and select sites for permanent burrows in their early dispersal phase. Spiderlings from two species, I. blackwalli (n = 20) and Aganippe sp. (n = 50), demonstrated a choice, under experimental conditions, for microhabitats that correspond to where adults typically occur in situ. An invasive veldt grass microhabitat was selected almost exclusively by spiderlings of I. sigillatum. At present, habitat dominated by veldt grass in Perth, Western Australia, has lower prey diversity and abundance than undisturbed habitats and therefore may act as an ecological trap for this species. Furthermore, as a homogenising force, veldt grass can spread to form a landscape trap in naturally heterogeneous ecosystems. Selection of specialised microhabitats of SREs may explain high extinction rates in old, stable landscapes undergoing (human-induced) rapid change. creator: Leanda D. Mason creator: Philip W. Bateman creator: Grant W. Wardell-Johnson uri: https://doi.org/10.7717/peerj.4715 license: http://creativecommons.org/licenses/by/4.0/ rights: © 2018 Mason et al. title: IL-6 and TNF-α salivary levels according to the periodontal status in Portuguese pregnant women link: https://peerj.com/articles/4710 last-modified: 2018-05-04 description: Background: Periodontitis is associated with increased concentration of inflammatory markers and saliva has been proposed as a non-invasive diagnostic fluid in oral and systemic diseases. The levels of salivary biomarkers, such as cytokines, could potentially be used to distinguish periodontally healthy individuals from subjects with periodontal disease. 
The purpose of this study was to characterize the salivary levels of two inflammatory biomarkers associated with periodontitis, interleukin-6 (IL-6) and tumour necrosis factor-alpha (TNF-α), in order to assess whether salivary levels of these cytokines could potentially be used to complement the diagnosis of periodontitis in pregnant women. Methods: Forty-four pregnant women were distributed into three groups, according to their periodontal status: healthy, mild/moderate periodontitis and severe periodontitis. Unstimulated saliva was collected and analysis of TNF-α and IL-6 salivary levels was performed with Immulite®. Results: Women with periodontitis exhibited significantly higher levels (p = 0.001) of salivary IL-6 and TNF-α compared with the healthy group: 25.1 (±11.2) pg/mL vs. 16.3 (±5.0) pg/mL and 29.7 (±17.2) pg/mL vs. 16.2 (±7.6) pg/mL, approximately 1.5 and 1.8 times more, respectively. Additionally, cytokines were significantly increased (p < 0.05) in severe periodontitis compared to periodontally healthy pregnant women. Conclusions: These results revealed that IL-6 and TNF-α salivary biomarkers provide high discriminatory capacity for distinguishing periodontal disease from periodontal health in pregnant women. creator: Vanessa Machado creator: Maria Fernanda Mesquita creator: Maria Alexandra Bernardo creator: Ester Casal creator: Luís Proença creator: José João Mendes uri: https://doi.org/10.7717/peerj.4710 license: http://creativecommons.org/licenses/by/4.0/ rights: © 2018 Machado et al. title: DNA metabarcoding of littoral hard-bottom communities: high diversity and database gaps revealed by two molecular markers link: https://peerj.com/articles/4705 last-modified: 2018-05-04 description: Biodiversity assessment of marine hard-bottom communities is hindered by the high diversity and size-ranges of the organisms present. We developed a DNA metabarcoding protocol for biodiversity characterization of structurally complex natural marine hard-bottom communities. 
We used two molecular markers: the “Leray fragment” of mitochondrial cytochrome c oxidase (COI), for which a novel primer set was developed, and the V7 region of the nuclear small subunit ribosomal RNA (18S). Eight different shallow marine littoral communities from two National Parks in Spain (one in the Atlantic Ocean and another in the Mediterranean Sea) were studied. Samples were sieved into three size fractions from where DNA was extracted separately. Bayesian clustering was used for delimiting molecular operational taxonomic units (MOTUs) and custom reference databases were constructed for taxonomic assignment. Despite applying stringent filters, we found high values for MOTU richness (2,510 and 9,679 MOTUs with 18S and COI, respectively), suggesting that these communities host a large amount of yet undescribed eukaryotic biodiversity. Significant gaps are still found in sequence reference databases, which currently prevent the complete taxonomic assignment of the detected sequences. In our dataset, 85% of 18S MOTUs and 64% of COI MOTUs could be identified to phylum or lower taxonomic level. Nevertheless, those unassigned were mostly rare MOTUs with low numbers of reads, and assigned MOTUs comprised over 90% of the total sequence reads. The identification rate might be significantly improved in the future, as reference databases are further completed. Our results show that marine metabarcoding, currently applied mostly to plankton or sediments, can be adapted to structurally complex hard bottom samples. Thus, eukaryotic metabarcoding emerges as a robust, fast, objective and affordable method to comprehensively characterize the diversity of marine benthic communities dominated by macroscopic seaweeds and colonial or modular sessile metazoans. The 18S marker lacks species-level resolution and thus cannot be recommended to assess the detailed taxonomic composition of these communities. 
Our new universal primers for COI can potentially be used for biodiversity assessment with high taxonomic resolution in a wide array of marine, terrestrial or freshwater eukaryotic communities. creator: Owen S. Wangensteen creator: Creu Palacín creator: Magdalena Guardiola creator: Xavier Turon uri: https://doi.org/10.7717/peerj.4705 license: http://creativecommons.org/licenses/by/4.0/ rights: © 2018 Wangensteen et al. title: Candidate genes in gastric cancer identified by constructing a weighted gene co-expression network link: https://peerj.com/articles/4692 last-modified: 2018-05-04 description: Background: Gastric cancer (GC) is one of the most common cancers with high mortality globally. However, the molecular mechanisms of GC are unclear, and the prognosis of GC is poor. Therefore, it is important to explore the underlying mechanisms and screen for novel prognostic markers and treatment targets. Methods: The genetic and clinical data of GC patients in The Cancer Genome Atlas (TCGA) were analyzed by weighted gene co-expression network analysis (WGCNA). Modules with clinical significance and preservation were distinguished, and gene ontology and pathway enrichment analysis were performed. Hub genes of these modules were validated in the TCGA dataset and another independent dataset from the Gene Expression Omnibus (GEO) database by t-test. Furthermore, the significance of these genes was confirmed via survival analysis. Results: We found that a preserved module consisting of 506 genes was associated with clinical traits including pathologic T stage and histologic grade. PDGFRB, COL8A1, EFEMP2, FBN1, EMILIN1, FSTL1 and KIRREL were identified as candidate genes in the module. Their expression levels were correlated with pathologic T stage and histologic grade, and also affected overall survival of GC patients. Conclusion: These candidate genes may be involved in proliferation and differentiation of GC cells. They may serve as novel prognostic markers and treatment targets. 
Moreover, most of them are reported in GC for the first time and deserve further research. creator: Jian Chen creator: Xiuwen Wang creator: Bing Hu creator: Yifu He creator: Xiaojun Qian creator: Wei Wang uri: https://doi.org/10.7717/peerj.4692 license: http://creativecommons.org/licenses/by/4.0/ rights: © 2018 Chen et al. title: Bioinformatic analysis and identification of potential prognostic microRNAs and mRNAs in thyroid cancer link: https://peerj.com/articles/4674 last-modified: 2018-05-04 description: Thyroid cancer is one of the most common endocrine malignancies. Multiple lines of evidence have revealed that a large number of microRNAs and mRNAs are abnormally expressed in thyroid cancer tissues. These microRNAs and mRNAs play important roles in tumorigenesis. In the present study, we identified 72 microRNAs and 1,766 mRNAs differentially expressed between thyroid cancer tissues and normal thyroid tissues and evaluated their prognostic values using Kaplan-Meier survival curves by log-rank test. Seven microRNAs (miR-146b, miR-184, miR-767, miR-6730, miR-6860, miR-196a-2 and miR-509-3) were associated with the overall survival. Among them, three microRNAs were linked with six differentially expressed mRNAs (miR-767 was predicted to target COL10A1, PLAG1 and PPP1R1C; miR-146b was predicted to target MMP16; miR-196a-2 was predicted to target SYT9). To identify the key genes in the protein-protein interaction network, we screened out the top 10 hub genes (NPY, NMU, KNG1, LPAR5, CCR3, SST, PPY, GABBR2, ADCY8 and SAA1) with higher degrees. Only LPAR5 was associated with the overall survival. Multivariate analysis demonstrated that miR-184, miR-146b, miR-509-3 and LPAR5 were independent risk factors for prognosis. The present study identified a series of prognostic microRNAs and mRNAs that have the potential to be targets for the treatment of thyroid cancer. 
creator: Jianing Tang creator: Deguang Kong creator: Qiuxia Cui creator: Kun Wang creator: Dan Zhang creator: Qianqian Yuan creator: Xing Liao creator: Yan Gong creator: Gaosong Wu uri: https://doi.org/10.7717/peerj.4674 license: http://creativecommons.org/licenses/by/4.0/ rights: © 2018 Tang et al. title: Candidate genes associated with red colour formation revealed by comparative genomic variant analysis of red- and green-skinned fruits of Japanese apricot (Prunus mume) link: https://peerj.com/articles/4625 last-modified: 2018-05-04 description: The red-skinned fruit of Japanese apricot (Prunus mume Sieb. et Zucc.) appeals to customers due to its eye-catching pigmentation, while the mechanism related to its colour formation is still unclear. In this study, genome re-sequencing of six Japanese apricot cultivars was carried out with approximately 92.2 Gb of clean bases using next-generation sequencing. A total of 32,004 unigenes were assembled with an average of 83.1% coverage rate relative to the reference genome. A wide range of genetic variation was detected, including 7,387,057 single nucleotide polymorphisms, 456,222 insertions or deletions and 129,061 structural variations in all genomes. Comparative sequencing data revealed that 13 candidate genes were involved in biosynthesis of anthocyanin. Using reverse transcription-quantitative polymerase chain reaction, significantly higher expression was observed in red-skinned than in green-skinned Japanese apricots for three anthocyanin synthesis structural genes (4CL, F3H and UFGT), five transcription factors (MYB–bHLH–WD40 complexes and NAC) and five anthocyanin accumulation related genes (GST1, RT1, UGT85A2, ABC and MATE transporters). Eight main kinds of anthocyanins were detected by UPLC/MS, and cyanidin 3-glucoside was identified as the major anthocyanin (124.2 mg/kg) in red-skinned cultivars. 
The activity of UDP-glucose flavonoid-3-O-glycosyltransferase enzyme determined by UPLC was significantly higher in all red-skinned cultivars, suggesting that it is the potential vital regulatory gene for biosynthesis of anthocyanin in Japanese apricot. creator: Xiaopeng Ni creator: Song Xue creator: Shahid Iqbal creator: Wanxu Wang creator: Zhaojun Ni creator: Muhammad Khalil-ur-Rehman creator: Zhihong Gao uri: https://doi.org/10.7717/peerj.4625 license: http://creativecommons.org/licenses/by/4.0/ rights: © 2018 Ni et al. title: Quantitative estimation of soil salinity by means of different modeling methods and visible-near infrared (VIS–NIR) spectroscopy, Ebinur Lake Wetland, Northwest China link: https://peerj.com/articles/4703 last-modified: 2018-05-03 description: Soil salinization is one of the most common forms of land degradation. The detection and assessment of soil salinity is critical for the prevention of environmental deterioration especially in arid and semi-arid areas. This study introduced the fractional derivative in the pretreatment of visible and near infrared (VIS–NIR) spectroscopy. The soil samples (n = 400) collected from the Ebinur Lake Wetland, Xinjiang Uyghur Autonomous Region (XUAR), China, were used as the dataset. After measuring the spectral reflectance and salinity in the laboratory, the raw spectral reflectance was preprocessed by means of the absorbance and the fractional derivative order in the range of 0.0–2.0 order with an interval of 0.1. Two different modeling methods, namely, partial least squares regression (PLSR) and random forest (RF) with preprocessed reflectance were used for quantifying soil salinity. The results showed that more spectral characteristics were refined for the spectrum reflectance treated via fractional derivative. The validation accuracies showed that RF models performed better than those of PLSR. 
The most effective model was established based on RF with the 1.5 order derivative of absorbance with the optimal values of R² (0.93), RMSE (4.57 dS m⁻¹), and RPD (2.78 ≥ 2.50). The developed RF model was stable and accurate in the application of spectral reflectance for determining the soil salinity of the Ebinur Lake wetland. The pretreatment of fractional derivative could be useful for monitoring multiple soil parameters with higher accuracy, which could effectively help to analyze the soil salinity. creator: Jingzhe Wang creator: Jianli Ding creator: Aerzuna Abulimiti creator: Lianghong Cai uri: https://doi.org/10.7717/peerj.4703 license: http://creativecommons.org/licenses/by/4.0/ rights: © 2018 Wang et al. title: Post-traumatic stress symptoms are associated with better performance on a delayed match-to-position task link: https://peerj.com/articles/4701 last-modified: 2018-05-03 description: Many individuals with posttraumatic stress disorder (PTSD) report experiencing frequent intrusive memories of the original traumatic event (e.g., flashbacks). These memories can be triggered by situations or stimuli that reflect aspects of the trauma and may reflect basic processes in learning and memory, such as generalization. It is possible that, through increased generalization, non-threatening stimuli that once evoked normal memories become associated with traumatic memories. Previous research has reported increased generalization in PTSD, but the role of visual discrimination processes has not been examined. To investigate visual discrimination in PTSD, 143 participants (Veterans and civilians) self-assessed for symptom severity were grouped according to the presence of severe PTSD symptoms (PTSS) vs. few/no symptoms (noPTSS). Participants were given a visual match-to-sample pattern separation task that varied trials by spatial separation (Low, Medium, High) and temporal delays (5, 10, 20, 30 s). 
Unexpectedly, the PTSS group demonstrated better discrimination performance than the noPTSS group at the most difficult spatial trials (Low spatial separation). Further assessment of accuracy and reaction time using diffusion drift modeling indicated that the better performance by the PTSS group on the hardest trials was not explained by slower reaction times, but rather a faster accumulation of evidence during decision making in conjunction with a reduced threshold, indicating a tendency in the PTSS group to decide quickly rather than waiting for additional evidence to support the decision. This result supports the need for future studies examining the precise role of discrimination and generalization in PTSD, and how these cognitive processes might contribute to expression and maintenance of PTSD symptoms. creator: Meghan D. Caulfield creator: Catherine E. Myers uri: https://doi.org/10.7717/peerj.4701 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2018 Caulfield and Myers