PeerJ: Statistics
https://peerj.com/articles/index.atom?journal=peerj&subject=7900
Statistics articles published in PeerJ

Estimation of the percentile of Birnbaum-Saunders distribution and its application to PM2.5 in Northern Thailand
https://peerj.com/articles/17019 (2024-02-29)
Warisa Thangjai, Sa-Aat Niwitpong, Suparat Niwitpong
The Birnbaum-Saunders distribution plays a crucial role in statistical analysis, serving as a model for failure time distribution in engineering and the distribution of particulate matter 2.5 (PM2.5) in environmental sciences. When assessing the health risks linked to PM2.5, it is crucial to give significant weight to percentile values, particularly focusing on lower percentiles, as they offer a more precise depiction of exposure levels and potential health hazards for the population. Mean and variance metrics may not fully encapsulate the comprehensive spectrum of risks connected to PM2.5 exposure. Various approaches, including the generalized confidence interval (GCI) approach, the bootstrap approach, the Bayesian approach, and the highest posterior density (HPD) approach, were employed to establish confidence intervals for the percentile of the Birnbaum-Saunders distribution. To assess the performance of these intervals, Monte Carlo simulations were conducted, evaluating them based on coverage probability and average length. The results demonstrate that the GCI approach is a favorable choice for estimating percentile confidence intervals. In conclusion, this article presents the results of the simulation study and showcases the practical application of these findings in the field of environmental sciences.
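To make the quantities concrete, the sketch below simulates Birnbaum-Saunders data, estimates the shape and scale parameters with the modified moment estimators, and builds a percentile-bootstrap interval for a percentile. This is an illustrative stand-in only: the GCI, Bayesian, and HPD constructions compared in the article are more involved, and the estimator and bootstrap scheme here are assumptions chosen for simplicity, not the authors' procedures.

```python
import math
import random
from statistics import NormalDist

def bs_sample(alpha, beta, n, rng):
    """Draw n values from a Birnbaum-Saunders(alpha, beta) distribution
    via its normal representation: T = beta*(w + sqrt(w^2 + 1))^2,
    where w = alpha*Z/2 and Z ~ N(0, 1)."""
    out = []
    for _ in range(n):
        w = alpha * rng.gauss(0.0, 1.0) / 2.0
        out.append(beta * (w + math.sqrt(w * w + 1.0)) ** 2)
    return out

def bs_mme(data):
    """Modified moment estimators: beta_hat = sqrt(s*r) and
    alpha_hat = sqrt(2*(sqrt(s/r) - 1)), where s and r are the
    arithmetic and harmonic sample means (s >= r always holds)."""
    n = len(data)
    s = sum(data) / n
    r = n / sum(1.0 / x for x in data)
    return math.sqrt(2.0 * (math.sqrt(s / r) - 1.0)), math.sqrt(s * r)

def bs_percentile(alpha, beta, p):
    """Closed-form 100p-th percentile of the BS distribution."""
    w = alpha * NormalDist().inv_cdf(p) / 2.0
    return beta * (w + math.sqrt(w * w + 1.0)) ** 2

def bootstrap_percentile_ci(data, p, b=500, level=0.95, seed=1):
    """Percentile-bootstrap interval for the 100p-th percentile,
    a simpler stand-in for the GCI construction in the article."""
    rng = random.Random(seed)
    n = len(data)
    ests = []
    for _ in range(b):
        boot = [data[rng.randrange(n)] for _ in range(n)]
        a_hat, b_hat = bs_mme(boot)
        ests.append(bs_percentile(a_hat, b_hat, p))
    ests.sort()
    return ests[int((1 - level) / 2 * b)], ests[int((1 + level) / 2 * b) - 1]

# Simulated data with alpha = 0.5, beta = 2; the true median equals beta.
data = bs_sample(0.5, 2.0, 500, random.Random(42))
a_hat, b_hat = bs_mme(data)
est = bs_percentile(a_hat, b_hat, 0.5)
lo, hi = bootstrap_percentile_ci(data, 0.5)
```

Coverage probability and average length, the criteria used in the article, would be obtained by repeating this interval construction over many simulated samples and recording how often the interval captures the true percentile.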
Does it pay to pay? A comparison of the benefits of open-access publishing across various sub-fields in biology
https://peerj.com/articles/16824 (2024-02-27)
Amanda D. Clark, Tanner C. Myers, Todd D. Steury, Ali Krzton, Julio Yanes, Angela Barber, Jacqueline Barry, Subarna Barua, Katherine Eaton, Devadatta Gosavi, Rebecca Nance, Zahida Pervaiz, Chidozie Ugochukwu, Patricia Hartman, Laurie S. Stevison
Authors are often faced with the decision of whether to maximize traditional impact metrics or minimize costs when choosing where to publish the results of their research. Many subscription-based journals now offer the option of paying an article processing charge (APC) to make their work open. Though such “hybrid” journals make research more accessible to readers, their APCs often come with high price tags and can exclude authors who lack the capacity to pay to make their research accessible. Here, we tested if paying to publish open access in a subscription-based journal benefited authors by conferring more citations relative to closed access articles. We identified 146,415 articles published in 152 hybrid journals in the field of biology from 2013–2018 to compare the number of citations between various types of open access and closed access articles. In a simple generalized linear model analysis of our full dataset, we found that publishing open access in hybrid journals that offer the option confers an average citation advantage to authors of 17.8 citations compared to closed access articles in similar journals. After taking into account the number of authors, Journal Citation Reports 2020 Quartile, year of publication, and Web of Science category, we still found that open access generated significantly more citations than closed access (p < 0.0001). However, results were complex, with exact differences in citation rates among access types impacted by these other variables. This citation advantage based on access type persisted even when comparing open and closed access articles published in the same issue of a journal (p < 0.0001). However, by examining articles where the authors paid an article processing charge, we found that cost itself was not predictive of citation rates (p = 0.14).
Based on our findings of access type and other model parameters, we suggest that, in the case of the 152 journals we analyzed, paying for open access does confer a citation advantage. For authors with limited budgets, we recommend pursuing open access alternatives that do not require paying a fee as they still yielded more citations than closed access. For authors who are considering where to submit their next article, we offer additional suggestions on how to balance exposure via citations with publishing costs.
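The citation advantage can be read as a rate ratio. For a single binary predictor, the slope of a Poisson generalized linear model reduces to the log ratio of the two group means, which the toy simulation below illustrates. The counts and means here are hypothetical, and the sketch is far simpler than the authors' model, which adjusts for author count, journal quartile, year, and field.

```python
import math
import random

def poisson(lam, rng):
    """Poisson draw via Knuth's multiplication method (fine for small lambda)."""
    l, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= l:
            return k
        k += 1

rng = random.Random(7)
# Hypothetical citation counts: open-access articles drawn with a
# higher Poisson mean (30) than closed-access ones (18).
open_cites = [poisson(30.0, rng) for _ in range(2000)]
closed_cites = [poisson(18.0, rng) for _ in range(2000)]

# With a single binary covariate, the Poisson-GLM slope has a closed
# form: beta = log(mean_open / mean_closed), the log citation-rate ratio.
mean_open = sum(open_cites) / len(open_cites)
mean_closed = sum(closed_cites) / len(closed_cites)
rate_ratio = mean_open / mean_closed
log_rate_ratio = math.log(rate_ratio)
```

A rate ratio above 1 corresponds to a citation advantage for open access; adding covariates, as the authors did, changes the coefficient from this simple closed form to one estimated by maximum likelihood.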
How to account for behavioral states in step-selection analysis: a model comparison
https://peerj.com/articles/16509 (2024-02-26)
Jennifer Pohle, Johannes Signer, Jana A. Eccard, Melanie Dammhahn, Ulrike E. Schlägel
Step-selection models are widely used to study animals’ fine-scale habitat selection based on movement data. Resource preferences and movement patterns, however, often depend on the animal’s unobserved behavioral states, such as resting or foraging. As this is ignored in standard (integrated) step-selection analyses (SSA, iSSA), different approaches have emerged to account for such states in the analysis. The performance of these approaches and the consequences of ignoring the states in step-selection analysis, however, have rarely been quantified. We evaluate the recent idea of combining iSSAs with hidden Markov models (HMMs), which allows for a joint estimation of the unobserved behavioral states and the associated state-dependent habitat selection. Besides theoretical considerations, we use an extensive simulation study and a case study on fine-scale interactions of simultaneously tracked bank voles (Myodes glareolus) to compare this HMM-iSSA empirically to both the standard and a widely used classification-based iSSA (i.e., a two-step approach based on a separate prior state classification). Moreover, to facilitate its use, we implemented the basic HMM-iSSA approach in the R package HMMiSSA available on GitHub.
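The state-decoding step that HMM-based approaches add to step-selection analysis can be illustrated with a minimal two-state Viterbi decoder on step lengths, where short steps suggest resting and long steps suggest foraging. This is a sketch only: it is not the HMMiSSA package's API, and the exponential emission model, transition matrix, and state means are assumptions chosen for illustration.

```python
import math

def viterbi(obs, means, trans, init):
    """Most-likely state sequence for an HMM with exponential
    step-length emissions, computed in log space."""
    def logf(x, m):  # log density of Exp with mean m
        return -math.log(m) - x / m
    n_states = len(means)
    V = [[0.0] * n_states for _ in obs]
    back = [[0] * n_states for _ in obs]
    for s in range(n_states):
        V[0][s] = math.log(init[s]) + logf(obs[0], means[s])
    for t in range(1, len(obs)):
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: V[t - 1][r] + math.log(trans[r][s]))
            back[t][s] = best_prev
            V[t][s] = (V[t - 1][best_prev] + math.log(trans[best_prev][s])
                       + logf(obs[t], means[s]))
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: V[-1][s])
    path = [state]
    for t in range(len(obs) - 1, 0, -1):
        state = back[t][state]
        path.append(state)
    return path[::-1]

# Hypothetical step lengths: short steps suggest resting (state 0),
# long steps suggest foraging/exploring (state 1).
steps = [0.2, 0.5, 12.0, 9.0, 0.3, 0.4]
states = viterbi(steps,
                 means=[1.0, 10.0],               # mean step length per state
                 trans=[[0.9, 0.1], [0.1, 0.9]],  # sticky transitions
                 init=[0.5, 0.5])
```

A classification-based iSSA would fit separate step-selection models to the segments labeled 0 and 1; the joint HMM-iSSA instead estimates the states and the state-dependent selection coefficients together.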
Mathematical model of voluntary vaccination against schistosomiasis
https://peerj.com/articles/16869 (2024-02-07)
Santiago Lopez, Samiya Majid, Rida Syed, Jan Rychtar, Dewey Taylor
Human schistosomiasis is a chronic and debilitating neglected tropical disease caused by parasitic worms of the genus Schistosoma. It is endemic in many countries in sub-Saharan Africa. Although there is currently no vaccine available, vaccines are in development. In this paper, we extend a simple compartmental model of schistosomiasis transmission by incorporating the vaccination option. Unlike previous models of schistosomiasis transmission that focus on control and treatment at the population level, our model focuses on incorporating human behavior and voluntary individual vaccination. We identify vaccination rates needed to achieve herd immunity as well as optimal voluntary vaccination rates. We demonstrate that the prevalence remains too high (higher than 1%) unless the vaccination costs are sufficiently low. Thus, we can conclude that voluntary vaccination (with or without mass drug administration) may not be sufficient to eliminate schistosomiasis as a public health concern. The cost of the vaccine (relative to the cost of schistosomiasis infection) is the most important factor determining whether voluntary vaccination can yield elimination of schistosomiasis. When the cost is low, the optimal voluntary vaccination rate is high enough that the prevalence of schistosomiasis declines below 1%. Once the vaccine becomes available for public use, it will be crucial to ensure that individuals can access the vaccine as cheaply as possible.
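The interplay between vaccination rate and endemic prevalence can be sketched with a deliberately simplified SIS-type model with vaccination and waning immunity. The actual model in the article (with snail-host dynamics and behavioral choice of vaccination) is richer, so the equations and parameter values below are illustrative assumptions only.

```python
def simulate(beta, gamma, psi, omega, t_end=200.0, dt=0.01):
    """Euler integration of a toy SIS model with vaccination.
    s, i, v are population fractions (susceptible, infected,
    vaccinated); beta is the transmission rate, gamma the recovery
    rate, psi the vaccination rate, omega the waning rate."""
    s, i, v = 0.99, 0.01, 0.0
    for _ in range(int(t_end / dt)):
        ds = -beta * s * i + gamma * i - psi * s + omega * v
        di = beta * s * i - gamma * i
        dv = psi * s - omega * v
        s, i, v = s + dt * ds, i + dt * di, v + dt * dv
    return s, i, v

# R0 = beta/gamma = 2: endemic without vaccination (i* = 1 - 1/R0 = 0.5),
# but eliminated once vaccination holds s below gamma/beta.
s0, i_no_vacc, v0 = simulate(beta=2.0, gamma=1.0, psi=0.0, omega=0.1)
s1, i_vacc, v1 = simulate(beta=2.0, gamma=1.0, psi=0.5, omega=0.1)
```

In this toy system, vaccination at rate psi = 0.5 drives the susceptible fraction to omega/(psi + omega), pushing the effective reproduction number below 1 and eliminating the infection; the article's contribution is to ask whether *voluntary* uptake, driven by the relative cost of vaccine and infection, reaches such a rate.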
Prevalence of type 2 diabetes mellitus and impaired fasting glucose, and their associated lifestyle factors among teachers in the CLUSTer cohort
https://peerj.com/articles/16778 (2024-01-22)
Yit Han Ng, Foong Ming Moy, Noran Naqiah Hairi, Awang Bulgiba
Background
Teachers are responsible for educating future generations and therefore play an important role in a country’s education system. Teachers constitute about 2.6% of all employees in Malaysia, making it one of the largest workforces in the country. While health and well-being are crucial to ensuring teachers’ work performance, reports on non-communicable diseases such as type 2 diabetes mellitus (T2DM) among Malaysian teachers are scarce. Hence, this study focused on the prevalence of T2DM, undiagnosed diabetes mellitus (DM), impaired fasting glucose (IFG), and underlying lifestyle factors associated with these outcomes among Malaysian teachers.
Methods
This is a cross-sectional study from the CLUSTer cohort. A total of 14,144 teachers from Peninsular Malaysia were included in this study. The teachers’ sociodemographic and lifestyle characteristics were described using a weighted complex-sample analysis. A matched age group comparison was carried out between teachers and the Malaysian general population on T2DM, undiagnosed DM, and IFG status. Next, the researchers examined the association of lifestyle factors with T2DM and IFG using multivariable logistic regression.
Results
The prevalence of T2DM, undiagnosed DM, and IFG among the Malaysian teachers were 4.1%, 5.1%, and 5.6%, respectively. The proportions of teachers with T2DM (both diagnosed and undiagnosed) and IFG increased linearly with age. Teachers had a lower weighted prevalence of T2DM (known and undiagnosed) than the general population. However, teachers were more likely to have IFG than the general population, particularly those aged 45 years and older. Among all lifestyle indicators, only waist circumference (aOR: 1.14, 95% CI: 1.08, 1.20) was found to be associated with T2DM, whereas waist circumference (aOR: 1.10, 95% CI: 1.05, 1.15) and physical activity [moderately active = (aOR: 0.71, 95% CI: 0.52, 0.98); highly active = (aOR: 0.56, 95% CI: 0.40, 0.80)] were associated with IFG.
Conclusions
Modifiable lifestyle factors such as abdominal obesity and physical activity were associated with T2DM and IFG. Intervention programs targeting these factors could help reduce future treatment costs and increase productivity.
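The adjusted odds ratios above come from multivariable logistic regression; the crude (unadjusted) version of the same quantity, together with a Woolf-type confidence interval, can be computed directly from a 2x2 table. The counts below are hypothetical, not the study's data.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Woolf 95% CI from a 2x2 table:
    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    SE of log(OR) is sqrt(1/a + 1/b + 1/c + 1/d)."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# Hypothetical counts: T2DM by abdominal-obesity status.
or_, lo, hi = odds_ratio_ci(a=120, b=880, c=60, d=940)
```

Multivariable logistic regression generalizes this: each adjusted OR is exp of the corresponding coefficient, holding the other covariates fixed.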
A systematic review of conference papers presented at two large Japanese psychology conferences in 2013 and 2018: did Japanese social psychologists selectively report p < 0.05 results without peer review?
https://peerj.com/articles/16763 (2024-01-18)
Kai Hiraishi, Asako Miura, Masataka Higuchi, Yoshitsugu Fujishima, Daiki Nakamura, Masaki Suyama
We conducted a systematic review of conference papers in social psychology at two large psychology conferences in Japan: the Japanese Psychological Association and the Japanese Society for Social Psychology. The conference papers were effectively not subjected to peer review; hence, they were suitable for testing if psychologists selectively reported statistically significant findings without pressure from journal editors and reviewers. We investigated the distributions of z-values converted from the p-values reported in the articles presented at the 2013 and 2018 conferences. The z-curve analyses suggest the existence of selective reporting by the authors in 2013. The expected discovery rate (EDR) was much lower than the observed discovery rate (ODR; 7% vs. 76%, respectively), and the 95% confidence interval (CI) did not include the ODR. However, this does not mean that the set of studies completely lacked evidential value. The expected replication rate (ERR) was 31%; this is significantly higher than 5%, which was expected under the null hypothesis of no effect. Changes were observed between 2013 and 2018. The ERR increased (31% to 44%), and the EDR almost doubled (7% to 13%). However, the estimation of the maximum false discovery rate (FDR; 68% in 2013 and 35% in 2018) suggested that a substantial proportion of the reported findings were false positives. Overall, while social psychologists in Japan engaged in selective reporting, this does not mean that the entire field was covered with false positives. In addition, slight signs of improvement were observed in how they reported their findings. Still, the evidential value of the target studies was weak, even in 2018, allowing for no optimism.
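z-curve analysis starts by converting the reported two-sided p-values into the z-statistics that produced them; the distribution of those z-values is then modeled to estimate quantities like the EDR and ERR. A minimal sketch of the conversion step:

```python
from statistics import NormalDist

def p_to_z(p):
    """Convert a two-sided p-value to the absolute z-statistic that
    produced it: |z| = Phi^{-1}(1 - p/2)."""
    return NormalDist().inv_cdf(1.0 - p / 2.0)

# p = 0.05 corresponds to |z| of about 1.96, the conventional cutoff;
# a pile-up of z-values just above 1.96 is the signature of selective
# reporting that z-curve quantifies.
zs = [p_to_z(p) for p in (0.05, 0.01, 0.001)]
```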
phytools 2.0: an updated R ecosystem for phylogenetic comparative methods (and other things)
https://peerj.com/articles/16505 (2024-01-05)
Liam J. Revell
Phylogenetic comparative methods comprise the general endeavor of using an estimated phylogenetic tree (or set of trees) to make secondary inferences: about trait evolution, diversification dynamics, biogeography, community ecology, and a wide range of other phenomena or processes. Over the past ten years or so, the phytools R package has grown to become an important research tool for phylogenetic comparative analysis. phytools is a diverse contributed R library now consisting of hundreds of different functions covering a variety of methods and purposes in phylogenetic biology. As of the time of writing, phytools included functionality for fitting models of trait evolution, for reconstructing ancestral states, for studying diversification on trees, and for visualizing phylogenies, comparative data, and fitted models, as well as numerous other tasks related to phylogenetic biology. Here, I describe some significant features of and recent updates to phytools, while also illustrating several popular workflows of the phytools computational software.
Modeling ocean distributions and abundances of natural- and hatchery-origin Chinook salmon stocks with integrated genetic and tagging data
https://peerj.com/articles/16487 (2023-11-28)
Alexander J. Jensen, Ryan P. Kelly, William H. Satterthwaite, Eric J. Ward, Paul Moran, Andrew Olaf Shelton
Background
Considerable resources are spent to track fish movement in marine environments, often with the intent of estimating behavior, distribution, and abundance. Resulting data from these monitoring efforts, including tagging studies and genetic sampling, often can be siloed. For Pacific salmon in the Northeast Pacific Ocean, predominant data sources for fish monitoring are coded wire tags (CWTs) and genetic stock identification (GSI). Despite their complementary strengths and weaknesses in coverage and information content, the two data streams rarely have been integrated to inform Pacific salmon biology and management. Joint, or integrated, models can combine and contextualize multiple data sources in a single statistical framework to produce more robust estimates of fish populations.
Methods
We introduce and fit a comprehensive joint model that integrates data from CWT recoveries and GSI sampling to inform the marine life history of Chinook salmon stocks at spatial and temporal scales relevant to ongoing fisheries management efforts. In a departure from similar models based primarily on CWT recoveries, modeled stocks in the new framework encompass both hatchery- and natural-origin fish. We specifically model the spatial distribution and marine abundance of four distinct stocks with spawning locations in California and southern Oregon, one of which is listed under the U.S. Endangered Species Act.
Results
Using the joint model, we generated the most comprehensive estimates of marine distribution to date for all modeled Chinook salmon stocks, including historically data-poor and low-abundance stocks. Estimated marine distributions from the joint model were broadly similar to estimates from a simpler, CWT-only model but did suggest some differences in distribution in select seasons. Model output also included novel stock-, year-, and season-specific estimates of marine abundance. We observed and partially addressed several challenges in model convergence with the use of supplemental data sources and model constraints; similar difficulties are not unexpected with integrated modeling. We identify several options for improved data collection that could address issues in convergence and increase confidence in model estimates of abundance. We expect these model advances and results to provide management-relevant biological insights, with the potential to inform future mixed-stock fisheries management efforts, as well as a foundation for more expansive and comprehensive analyses to follow.
Sensitivity analysis of selection bias: a graphical display by bias-correction index
https://peerj.com/articles/16411 (2023-11-16)
Ping-Chen Chung, I-Feng Lin
Background
In observational studies, how the magnitude of potential selection bias in a sensitivity analysis can be quantified is rarely discussed. The purpose of this study was to develop a sensitivity analysis strategy by using the bias-correction index (BCI) approach for quantifying the influence and direction of selection bias.
Methods
We used a BCI, a function of selection probabilities conditional on outcome and covariates, with different selection bias scenarios in a logistic regression setting. A bias-correction sensitivity plot was used to illustrate the associations between proctoscopy examination and sociodemographic variables, using data from the Taiwan National Health Interview Survey (NHIS) and from the subset of individuals who consented to having their health insurance data further linked.
Results
We included 15,247 people aged ≥20 years, 87.74% of whom signed the informed consent. When the entire sample was considered, smokers were less likely to undergo proctoscopic examination (odds ratio (OR): 0.69, 95% CI [0.57–0.84]) than nonsmokers were. When the data of only the people who provided consent were considered, the OR was 0.76 (95% CI [0.62–0.94]). The bias-correction sensitivity plot indicated varying ORs under different degrees of selection bias.
Conclusions
When data are only available for a subsample of a population, a bias-correction sensitivity plot can be used to easily visualize varying ORs under different selection bias scenarios. A similar strategy can be applied to models other than logistic regression if an appropriate BCI is derived.
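The kind of sensitivity grid behind such a plot can be sketched with the standard 2x2 selection-bias factor, used here as a stand-in for the article's BCI (which conditions on covariates as well). The selection probabilities in the grid are assumed values for illustration, applied to the reported subsample OR of 0.76.

```python
def corrected_or(or_observed, s11, s10, s01, s00):
    """Selection-bias correction for an odds ratio. s_ay is the
    assumed probability of inclusion given exposure a and outcome y;
    the multiplicative bias factor is (s11*s00)/(s10*s01), so the
    corrected OR is the observed OR divided by that factor."""
    bias = (s11 * s00) / (s10 * s01)
    return or_observed / bias

# Sensitivity grid: how the OR of 0.76 from the consented subsample
# would shift if inclusion of exposed cases (s11) differed from the
# other cells (held at 0.9 here).
grid = [(s11, corrected_or(0.76, s11, 0.9, 0.9, 0.9))
        for s11 in (0.7, 0.8, 0.9, 1.0)]
```

When s11 equals the other cells the bias factor is 1 and the OR is unchanged; plotting the corrected OR over such a grid gives a bias-correction sensitivity display of the kind the article proposes.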
Confidence intervals for ratio of means of delta-lognormal distributions based on left-censored data with application to rainfall data in Thailand
https://peerj.com/articles/16397 (2023-11-09)
Warisa Thangjai, Sa-Aat Niwitpong
Thailand is a country that is prone to both floods and droughts, and these natural disasters have significant impacts on the country’s people, economy, and environment. Estimating rainfall is an important part of flood and drought prevention. Rainfall data typically contain both zero and positive observations, and the distribution of rainfall often follows the delta-lognormal distribution. However, it is important to note that rainfall data can be censored, meaning that some values may be missing or truncated. The interval estimator for the ratio of means is useful when comparing the means of two samples. The purpose of this article was to compare the performance of several approaches for statistically analyzing left-censored data. The performance of the confidence intervals was evaluated using the coverage probability and average length, assessed through Monte Carlo simulation. The approaches examined included several variations of the generalized confidence interval (GCI) approach, as well as the Bayesian, parametric bootstrap, and method of variance estimates recovery approaches. For (ξ1, ξ2) = (0.10, 0.10), simulations showed that the Bayesian approach is a suitable choice for constructing the credible interval for the ratio of means of delta-lognormal distributions based on left-censored data. For (ξ1, ξ2) = (0.10, 0.25), the parametric bootstrap approach was a strong alternative for constructing the confidence interval. However, the GCI approach can be considered for constructing the confidence interval when the sample sizes increase. Practical applications demonstrating the use of these techniques on rainfall data showed that the confidence interval based on the GCI approach covered the ratio of population means and had the smallest length. The proposed approaches’ effectiveness was illustrated using daily rainfall datasets from the provinces of Chiang Rai and Chiang Mai in Thailand.
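As a simplified, hypothetical sketch of the interval-estimation task (a nonparametric percentile bootstrap; the article uses a parametric bootstrap and additionally handles left-censoring, which is ignored here), a confidence interval for the ratio of means of two delta-lognormal samples can be formed as follows. All parameter values are illustrative.

```python
# Simplified sketch: percentile-bootstrap CI for the ratio of means of two
# delta-lognormal samples. Illustrative only: this is a nonparametric
# bootstrap (the article's approach is parametric) and censoring is ignored.
import random
import statistics

def delta_lognormal_sample(n, delta, mu, sigma, rng):
    # delta is the probability of a zero observation; positives are lognormal
    return [0.0 if rng.random() < delta else rng.lognormvariate(mu, sigma)
            for _ in range(n)]

def bootstrap_ci_ratio(x1, x2, b=2000, alpha=0.05, rng=None):
    # Resample each sample with replacement, collect the ratio of means,
    # and take the central (1 - alpha) quantile range.
    rng = rng or random.Random()
    ratios = sorted(
        statistics.fmean(rng.choices(x1, k=len(x1)))
        / statistics.fmean(rng.choices(x2, k=len(x2)))
        for _ in range(b)
    )
    return ratios[int(b * alpha / 2)], ratios[int(b * (1 - alpha / 2)) - 1]

rng = random.Random(1)
x1 = delta_lognormal_sample(200, 0.10, 1.0, 0.5, rng)  # xi_1 = 0.10
x2 = delta_lognormal_sample(200, 0.10, 1.0, 0.5, rng)  # xi_2 = 0.10
lo, hi = bootstrap_ci_ratio(x1, x2, rng=rng)
print(f"95% bootstrap CI for the ratio of means: [{lo:.3f}, {hi:.3f}]")
```

Since the two samples here are drawn from the same distribution, the interval should lie around 1; coverage probability and average length, as in the article's simulations, would be estimated by repeating this over many replications.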