title: PeerJ description: Articles published in PeerJ link: https://peerj.com/articles/index.rss3?journal=peerj&page=1351 creator: info@peerj.com PeerJ errorsTo: info@peerj.com PeerJ language: en title: Performance evaluation of deep neural ensembles toward malaria parasite detection in thin-blood smear images link: https://peerj.com/articles/6977 last-modified: 2019-05-28 description: Background: Malaria is a life-threatening disease caused by Plasmodium parasites that infect the red blood cells (RBCs). Manual identification and counting of parasitized cells in microscopic thick/thin-film blood examination remains the common but burdensome method for disease diagnosis. Its diagnostic accuracy is adversely impacted by inter/intra-observer variability, particularly in large-scale screening under resource-constrained settings. Introduction: Data-driven deep learning algorithms such as convolutional neural networks (CNNs) have become the architecture of choice for image recognition tasks in state-of-the-art computer-aided diagnostic tools. However, CNNs suffer from high variance and may overfit due to their sensitivity to training data fluctuations. Objective: The primary aim of this study is to reduce model variance and improve robustness and generalization by constructing model ensembles for detecting parasitized cells in thin-blood smear images. Methods: We evaluate the performance of custom and pretrained CNNs and construct an optimal model ensemble for classifying parasitized and normal cells in thin-blood smear images. Cross-validation is performed at the patient level to prevent data leakage into the validation set and to reduce generalization error. 
The models are evaluated in terms of the following performance metrics: (a) Accuracy; (b) Area under the receiver operating characteristic (ROC) curve (AUC); (c) Mean squared error (MSE); (d) Precision; (e) F-score; and (f) Matthews Correlation Coefficient (MCC). Results: The ensemble model constructed with VGG-19 and SqueezeNet outperformed the state-of-the-art in several performance metrics for classifying parasitized and uninfected cells, which may aid improved disease screening. Conclusions: Ensemble learning reduces the model variance by optimally combining the predictions of multiple models and decreases the sensitivity to the specifics of training data and selection of training algorithms. The performance of the model ensemble simulates real-world conditions with reduced variance and overfitting, and leads to improved generalization. creator: Sivaramakrishnan Rajaraman creator: Stefan Jaeger creator: Sameer K. Antani uri: https://doi.org/10.7717/peerj.6977 license: http://creativecommons.org/publicdomain/zero/1.0/ rights: title: Functional analysis of lncRNAs based on competitive endogenous RNA in tongue squamous cell carcinoma link: https://peerj.com/articles/6991 last-modified: 2019-05-28 description: Background: Tongue squamous cell carcinoma (TSCC) is the most common malignant tumor in the oral cavity. An increasing number of studies have suggested that long noncoding RNA (lncRNA) plays an important role in the biological process of disease and is closely related to the occurrence and development of disease, including TSCC. Although many lncRNAs have been discovered, there remains a lack of research on the function and mechanism of lncRNAs. To better understand the clinical role and biological function of lncRNAs in TSCC, we conducted this study. Methods: In this study, 162 tongue samples, including 147 TSCC samples and 15 normal control samples, were downloaded from The Cancer Genome Atlas (TCGA) and investigated. 
We constructed a competitive endogenous RNA (ceRNA) regulatory network. Then, we identified two lncRNAs as key lncRNAs using Kaplan–Meier curve analysis and constructed a key lncRNA-miRNA-mRNA subnetwork. Furthermore, gene set enrichment analysis (GSEA) was carried out on mRNAs in the subnetwork after multivariate survival analysis of the Cox proportional hazards regression model. Results: The ceRNA regulatory network consists of six differentially expressed miRNAs (DEmiRNAs), 29 differentially expressed lncRNAs (DElncRNAs) and six differentially expressed mRNAs (DEmRNAs). Kaplan–Meier curve analysis of lncRNAs in the TSCC ceRNA regulatory network showed that only two lncRNAs, LINC00261 and PART1, are correlated with the total survival time of TSCC patients. After we constructed the key lncRNA-miRNA-mRNA subnetwork, the GSEA results showed that the key lncRNAs are mainly related to cytokines and the immune system. A high expression level of LINC00261 indicates a poor prognosis, while a high expression level of PART1 indicates a better prognosis. creator: Yidan Song creator: Yihua Pan creator: Jun Liu uri: https://doi.org/10.7717/peerj.6991 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2019 Song et al. title: Fisher linear discriminant analysis for classification and prediction of genomic susceptibility to stomach and colorectal cancers based on six STR loci in a northern Chinese Han population link: https://peerj.com/articles/7004 last-modified: 2019-05-28 description: Objective: Gastrointestinal cancer is the leading cause of cancer-related death worldwide. 
The aim of this study was to verify whether the genotype of six short tandem repeat (STR) loci including AR, Bat-25, D5S346, ER1, ER2, and FGA is associated with the risk of gastric cancer (GC) and colorectal cancer (CRC) and to develop a model that allows early diagnosis and prediction of inherited genomic susceptibility to GC and CRC. Methods: Alleles of six STR loci were determined using the peripheral blood of six colon cancer patients, five rectal cancer patients, eight GC patients, and 30 healthy controls. Fisher linear discriminant analysis (FDA) was used to establish the discriminant formula to distinguish GC and CRC patients from healthy controls. Leave-one-out cross validation and receiver operating characteristic (ROC) curves were used to validate the accuracy of the formula. The relationship between the STR status and immunohistochemical (IHC) and tumor markers was analyzed using multiple correspondence analysis. Results: D5S346 was confirmed as a GC- and CRC-related STR locus. For the first time, we established a discriminant formula on the basis of the six STR loci, which was used to estimate the risk coefficient of suffering from GC and CRC. The model was statistically significant (Wilks’ lambda = 0.471, χ² = 30.488, df = 13, and p = 0.004). The results of leave-one-out cross validation showed that the sensitivity of the formula was 73.7% and the specificity was 76.7%. The area under the ROC curve (AUC) was 0.926, with a sensitivity of 73.7% and a specificity of 93.3%. The STR status was shown to have a certain relationship with the expression of some IHC markers and the level of some tumor markers. Conclusions: The results of this study complement clinical diagnostic criteria and present markers for early prediction of GC and CRC. This approach will aid in improving risk awareness of susceptible individuals and contribute to reducing the incidence of GC and CRC by prevention and early detection. 
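The percentages reported in this description can be checked against the cohort sizes given in its Methods (6 + 5 + 8 = 19 patients, 30 controls). The sketch below is only a consistency check: the hit counts 14, 23 and 28 are inferred from the reported percentages and are not stated in the abstract, and the helper name `rate` is hypothetical.

```python
# Cohort sizes stated in the Methods: 6 colon + 5 rectal + 8 gastric cancer
# patients, and 30 healthy controls.
n_patients = 6 + 5 + 8
n_controls = 30


def rate(hits: int, total: int) -> float:
    """Return hits/total as a percentage rounded to one decimal place."""
    return round(100 * hits / total, 1)


# Assumed counts, inferred from the reported percentages (not in the abstract).
loo_sensitivity = rate(14, n_patients)  # patients correctly classified
loo_specificity = rate(23, n_controls)  # controls correctly classified
roc_specificity = rate(28, n_controls)  # controls correct at the ROC operating point

print(loo_sensitivity, loo_specificity, roc_specificity)
```

With only 19 patients, each misclassified patient shifts sensitivity by about 5.3 percentage points, which is worth keeping in mind when reading the reported figures.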
creator: Shuhong Hao creator: Ming Ren creator: Dong Li creator: Yujie Sui creator: Qingyu Wang creator: Gaoyang Chen creator: Zhaoyan Li creator: Qiwei Yang uri: https://doi.org/10.7717/peerj.7004 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2019 Hao et al. title: Identify CRNDE and LINC00152 as the key lncRNAs in age-related degeneration of articular cartilage through comprehensive and integrative analysis link: https://peerj.com/articles/7024 last-modified: 2019-05-28 description: Background: Osteoarthritis (OA) is one of the most important age-related degenerative diseases, and the leading cause of disability and chronic pain in the aging population. Recent studies have identified several lncRNA-associated functions involved in the development of OA. Because age is a key risk factor for OA, we investigated the differential expression of age-related lncRNAs in each stage of OA. Methods: Two gene expression profiles were downloaded from the GEO database and differentially expressed genes (DEGs) were identified across each of the different developmental stages of OA. Next, gene ontology (GO) functional and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were performed to annotate the function of the DEGs. Finally, a lncRNA-targeted DEG network was used to identify hub-lncRNAs. Results: A total of 174 age-related DEGs were identified. GO analyses confirmed that age-related degradation was strongly associated with cell adhesion, endodermal cell differentiation and collagen fibril organization. Significantly enriched KEGG pathways associated with these DEGs included the PI3K–Akt signaling pathway, focal adhesion, and ECM–receptor interaction. Further analyses via a protein–protein interaction (PPI) network identified two hub lncRNAs, CRNDE and LINC00152, involved in the process of age-related degeneration of articular cartilage. Our findings suggest that lncRNAs may play active roles in the development of OA. 
Investigation of the gene expression profiles in different development stages may supply a new target for OA treatment. creator: Pengfei Hu creator: Fangfang Sun creator: Jisheng Ran creator: Lidong Wu uri: https://doi.org/10.7717/peerj.7024 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2019 Hu et al. title: Returning a lost process by reintroducing a locally extinct digging marsupial link: https://peerj.com/articles/6622 last-modified: 2019-05-27 description: The eastern bettong (Bettongia gaimardi), a medium-sized digging marsupial, was reintroduced to a predator-free reserve after 100 years of absence from the Australian mainland. The bettong may have the potential to restore temperate woodlands degraded by a history of livestock grazing, by creating numerous small disturbances by digging. We investigated the digging capacity of the bettong and compared this to extant fauna, to answer the first key question of whether this species could be considered an ecosystem engineer, and ultimately if it has the capacity to restore lost ecological processes. We found that eastern bettongs were frequent diggers and, at a density of 0.3–0.4 animals ha−1, accounted for over half the total foraging pits observed (55%), with echidnas (Tachyglossus aculeatus), birds and feral rabbits (Oryctolagus cuniculus) accounting for the rest. We estimated that the population of bettongs present dug 985 kg of soil per ha per year in our study area. Bettongs dug more where available phosphorus was higher, where there was greater basal area of Acacia spp. and where kangaroo grazing was less. There was no effect on digging of eucalypt stem density or volume of logs on the ground. While bettong digging activity was more frequent under trees, digging also occurred in open grassland, and bettongs were the only species observed to dig in scalds (areas where topsoil has eroded to the B Horizon). 
These results highlight the potential for bettongs to enhance soil processes in a way not demonstrated by the existing fauna (native birds and echidnas) or the introduced rabbit. creator: Nicola T. Munro creator: Sue McIntyre creator: Ben Macdonald creator: Saul A. Cunningham creator: Iain J. Gordon creator: Ross B. Cunningham creator: Adrian D. Manning uri: https://doi.org/10.7717/peerj.6622 license: http://creativecommons.org/licenses/by/4.0/ rights: © 2019 Munro et al. title: Hierarchical generalized additive models in ecology: an introduction with mgcv link: https://peerj.com/articles/6876 last-modified: 2019-05-27 description: In this paper, we discuss an extension to two popular approaches to modeling complex structures in ecological data: the generalized additive model (GAM) and the hierarchical model (HGLM). The hierarchical GAM (HGAM) allows modeling of nonlinear functional relationships between covariates and outcomes where the shape of the function itself varies between different grouping levels. We describe the theoretical connection between HGAMs, HGLMs, and GAMs, explain how to model different assumptions about the degree of intergroup variability in functional response, and show how HGAMs can be readily fitted using existing GAM software, the mgcv package in R. We also discuss computational and statistical issues with fitting these models, and demonstrate how to fit HGAMs on example data. All code and data used to generate this paper are available at: github.com/eric-pedersen/mixed-effect-gams. creator: Eric J. Pedersen creator: David L. Miller creator: Gavin L. Simpson creator: Noam Ross uri: https://doi.org/10.7717/peerj.6876 license: http://creativecommons.org/licenses/by/4.0/ rights: © 2019 Pedersen et al. 
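As a reading aid for the HGAM entry above, one common way to write a model in which the shape of the smooth varies by group is a shared smoother plus group-level deviations. This is a sketch under assumptions: the symbols (link function g, intercept β₀, shared smoother f, group-level smoothers f_j) are illustrative and not taken from the abstract, and the paper may cover other variants.

```latex
% Observation i in group j, covariate x_{ij}, response y_{ij}.
% g is a link function, f a smoother shared by all groups,
% and f_j a group-specific smooth deviation from f.
g\left(\mathbb{E}[y_{ij}]\right) = \beta_0 + f(x_{ij}) + f_j(x_{ij})
```

Setting every f_j to zero recovers an ordinary GAM, while replacing the smooth f_j with a group-level constant gives a random-intercept model, which is the sense in which the HGAM extends both approaches.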
title: Detection of condition-specific marker genes from RNA-seq data with MGFR link: https://peerj.com/articles/6970 last-modified: 2019-05-27 description: The identification of condition-specific genes is key to advancing our understanding of cell fate decisions and disease development. Differential gene expression analysis (DGEA) has been the standard tool for this task. However, the number of samples that modern transcriptomic technologies allow us to study makes DGEA a daunting task. On the other hand, experiments with low numbers of replicates lack the statistical power to detect differentially expressed genes. We have previously developed MGFM, a tool for marker gene detection from microarrays that is particularly useful in the latter case. Here, we have adapted the algorithm behind MGFM to detect markers in RNA-seq data. MGFR groups samples with similar gene expression levels and flags potential markers of a sample type if their highest expression values represent all replicates of this type. We have benchmarked MGFR against other methods and found that its proposed markers accurately characterize the functional identity of different tissues and cell types in standard and single cell RNA-seq datasets. Then, we performed a more detailed analysis for three of these datasets, which profile the transcriptomes of different human tissues, immune and human blastocyst cell types, respectively. MGFR’s predicted markers were compared to gold-standard lists for these datasets and outperformed the other marker detectors. Finally, we suggest novel candidate marker genes for the examined tissues and cell types. MGFR is implemented as a freely available Bioconductor package (https://doi.org/doi:10.18129/B9.bioc.MGFR), which facilitates its use and integration with bioinformatics pipelines. creator: Khadija El Amrani creator: Gregorio Alanis-Lobato creator: Nancy Mah creator: Andreas Kurtz creator: Miguel A. 
Andrade-Navarro uri: https://doi.org/10.7717/peerj.6970 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2019 El Amrani et al. title: Enhanced mechanical, thermal and biocompatible nature of dual component electrospun nanocomposite for bone tissue engineering link: https://peerj.com/articles/6986 last-modified: 2019-05-27 description: Traditionally, in the Asian continent, oils are a widely accepted choice for alleviating bone-related disorders. The design of scaffolds resembling the extracellular matrix (ECM) is of great significance in bone tissue engineering. In this study, a multicomponent polyurethane (PU), canola oil (CO) and neem oil (NO) scaffold was developed using the electrospinning technique. The fabricated nanofibers were subjected to various physicochemical and biological tests to validate their suitability for bone tissue engineering. Morphological analysis of the multicomponent scaffold showed a reduction in fiber diameter (PU/CO: 853 ± 141.27 nm and PU/CO/NO: 633 ± 137.54 nm) compared to PU (890 ± 116.911 nm). The existence of CO and NO in the PU matrix was confirmed by an infrared (IR) spectrum showing the formation of hydrogen bonds. PU/CO displayed a mean contact angle of 108.7° ± 0.58° while PU/CO/NO exhibited a hydrophilic nature with an angle of 62.33° ± 2.52°. The developed multicomponent scaffold also exhibited higher thermal stability and increased mechanical strength compared to the pristine PU. Atomic force microscopy (AFM) analysis depicted lower surface roughness for the nanocomposites (PU/CO: 389 nm and PU/CO/NO: 323 nm) than the pristine PU (576 nm). Blood compatibility investigation displayed the anticoagulant nature of the composites. Cytocompatibility studies revealed the non-toxic nature of the developed composites with human fibroblast (HDF) cells. The newly developed porous PU nanocomposite scaffold comprising CO and NO may serve as a potential candidate for bone tissue engineering. 
creator: Guanbao Li creator: Pinquan Li creator: Qiuan Chen creator: Mohan Prasath Mani creator: Saravana Kumar Jaganathan uri: https://doi.org/10.7717/peerj.6986 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2019 Li et al. title: Effectiveness of nasal irrigation devices: a Thai multicentre survey link: https://peerj.com/articles/7000 last-modified: 2019-05-27 description: Background: Nasal irrigation is widely used as an adjunctive treatment for nasal diseases. There is little evidence regarding the efficacy of the devices used in this procedure. The objective of this survey was to evaluate the effectiveness of nasal irrigation devices based on the experiences of patients and physicians. Methods: We conducted a multicentre survey study between November 2017 and October 2018. Physician and patient questionnaires were developed based on the available literature and expert opinion. The physician questionnaire was submitted to the otolaryngology residents and staff of each centre and their network. The physicians were also asked to distribute the patient questionnaire to their patients. Results: Information regarding 331 devices used by the patients was collected. The mean age of the patients was 45.46 ± 17.19 years (range 5 to 81). Roughly half were male and half were female (48.6% vs. 51.4%). Among the high-pressure devices, we found that the high-pressure large-volume nasal irrigation devices yielded significantly higher symptom scores in seven of 12 domains (p < 0.05). Among the large-volume devices, we found that the large-volume high-pressure nasal irrigation devices received significantly higher symptom scores in four of 12 domains (p < 0.05). 
However, a higher proportion of patients using the large-volume high-pressure devices had retained fluid in the sinuses compared to those using large-volume low-pressure devices (p < 0.001). Conclusions: This survey supports the regular use of nasal irrigation, particularly with large-volume high-pressure devices, as an effective treatment for nasal disease. It may be effective at clearing nasal secretions, improving nasal congestion, decreasing post-nasal drip, improving sinus pain or headache, improving taste and smell, and improving sleep quality. It could be used by patients with good compliance and minimal side effects. creator: Patorn Piromchai creator: Charoiboon Puvatanond creator: Virat Kirtsreesakul creator: Saisawat Chaiyasate creator: Sanguansak Thanaviratananich uri: https://doi.org/10.7717/peerj.7000 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2019 Piromchai et al. title: On the presence of Dipturus nidarosiensis (Storm, 1881) in the Central Mediterranean area link: https://peerj.com/articles/7009 last-modified: 2019-05-27 description: The Norwegian skate Dipturus nidarosiensis (Storm, 1881) has only recently been recorded in the western Mediterranean Sea along the coast of southern Sardinia, off Algeria and the Alboran Sea. The present study confirmed the presence of the species in the Central Mediterranean Sea by identifying morphometric, morphological features and molecular markers. Biological sampling was conducted from 2010 to 2016 on eight specimens collected through commercial landings, offshore observer programmes and scientific surveys in Adriatic and Ionian waters at depths between 320 and 720 m. The total lengths of the specimens (juveniles and adults) ranged from 268 to 1,422 mm, and their body weights ranged from 44.5 to 12,540.0 g. They showed morphometric features that corresponded to those of Norwegian skates in the Northeast Atlantic and the Western Mediterranean. 
In previous analyses, molecular data were obtained from mitochondrial COI sequences. The haplotype network showed the occurrence of a common haplotype (Hap_1) shared by individuals from the North Atlantic and from Sardinian, Algerian and Spanish Mediterranean areas, but not from South Africa. The occurrence of individuals in different stages of life (i.e., juveniles, sub-adults and adults) and sexual development (immature and mature) suggested the presence of a species with a permanent reproductive allocation in the deep waters of the Mediterranean, which was exposed to a low level of fishing exploitation. Indeed, the deep depth distribution of the species could be the reason for the absence of information about this species in onshore or offshore fishery data collection programmes and scientific surveys. creator: Pierluigi Carbonara creator: Rita Cannas creator: Marilena Donnaloia creator: Riccardo Melis creator: Cristina Porcu creator: Maria Teresa Spedicato creator: Walter Zupa creator: Maria Cristina Follesa uri: https://doi.org/10.7717/peerj.7009 license: http://creativecommons.org/licenses/by/4.0/ rights: ©2019 Carbonara et al.