Approaches to describing inter-rater reliability of the overall clinical appearance of febrile infants and toddlers in the Emergency Department.

Emergency Medicine, UC Davis, Sacramento, California, USA
Emergency Medicine, Kern Medical Center, Bakersfield, California, USA
Emergency Medicine, University of California Irvine, Orange, California, USA
DOI
10.7287/peerj.preprints.444v1
Subject Areas
Emergency and Critical Care, Epidemiology, Evidence Based Medicine, Pediatrics
Keywords
Gwet’s AC, Inter-rater agreement, Cohen’s kappa, Graphical analysis, Emergency medicine, Pediatric, Fever, Clinical appearance
Copyright
© 2014 Walsh et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Walsh P, Thornton JM, Walker N, McCoy JG, Baal J, Baal J, Mendoza N, Banimahd F, Asato J. 2014. Approaches to describing inter-rater reliability of the overall clinical appearance of febrile infants and toddlers in the Emergency Department. PeerJ PrePrints 2:e444v1

Abstract

Objectives: To measure inter-rater agreement on the overall clinical appearance of febrile children aged less than 24 months and to compare methods for doing so.

Study Design and Setting: We performed an observational study of inter-rater reliability of the assessment of febrile children in a county hospital emergency department serving a mixed urban and rural population. Two emergency medicine healthcare providers independently evaluated the overall clinical appearance of children less than 24 months of age who had presented with fever. Each recorded an initial ‘gestalt’ assessment of whether the child was ill appearing, not ill appearing, or whether they were unsure, and then repeated this assessment after examining the child. Each rater was blinded to the other’s assessment. Our primary analysis was graphical. We also calculated Cohen’s κ, Gwet’s agreement coefficient (AC), and other measures of agreement, along with weighted variants of these. We examined the effect of the time interval between examinations and of patient and provider characteristics on inter-rater agreement.
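For orientation, Cohen’s κ and Gwet’s AC are both chance-corrected agreement coefficients of the same general form and differ only in how the chance-agreement term is defined. The standard unweighted definitions for K rating categories are given below; they are stated here for context and are not reproduced from the preprint.

\[
\kappa = \frac{p_o - p_e^{(\kappa)}}{1 - p_e^{(\kappa)}},
\qquad
p_e^{(\kappa)} = \sum_{k=1}^{K} p_{k\cdot}\, p_{\cdot k}
\]
\[
\mathrm{AC}_1 = \frac{p_o - p_e^{(\mathrm{AC}_1)}}{1 - p_e^{(\mathrm{AC}_1)}},
\qquad
p_e^{(\mathrm{AC}_1)} = \frac{1}{K-1}\sum_{k=1}^{K} \pi_k\,(1 - \pi_k),
\qquad
\pi_k = \frac{p_{k\cdot} + p_{\cdot k}}{2}
\]

Here \(p_o\) is the observed proportion of agreement and \(p_{k\cdot}\), \(p_{\cdot k}\) are the two raters’ marginal proportions for category k. When one category dominates, κ’s expected-agreement term becomes large and depresses κ, whereas AC1’s chance term shrinks, so the two coefficients can diverge substantially on the same table.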

Results: We analyzed 159 of the 173 patients enrolled. Median age was 9.5 months (lower and upper quartiles 4.9–14.6), 99/159 (62%) were boys, and 22/159 (14%) were admitted. Overall, 118/159 (74%) and 119/159 (75%) children were classified as well appearing on the initial ‘gestalt’ impression by the two examiners, respectively. Summary statistics varied from 0.223 for weighted κ to 0.635 for Gwet’s AC2. Inter-rater agreement was affected by the time interval between the evaluations and by the age of the child, but not by the experience levels of the rater pairs. Classifications of ‘not ill appearing’ were more reliable than other classifications.
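As an illustration of why the coefficients can differ so much on a single dataset, the sketch below computes observed agreement, Cohen’s κ, and Gwet’s AC1 from paired three-category ratings in base R. The category labels and the small example vectors are invented for illustration only; they are not the study data, and the authors’ own analysis code is supplied in the appendices.

## Illustrative data only -- not the study dataset.
rater1 <- factor(c("well", "well", "unsure", "ill", "well", "well", "unsure", "well"),
                 levels = c("well", "unsure", "ill"))
rater2 <- factor(c("well", "well", "well", "ill", "well", "unsure", "unsure", "well"),
                 levels = c("well", "unsure", "ill"))

tab <- table(rater1, rater2)      # K x K contingency table of paired ratings
n   <- sum(tab)
K   <- nrow(tab)
p_o <- sum(diag(tab)) / n         # observed proportion of agreement

## Cohen's kappa: chance agreement from the product of the raters' marginals
p_row <- rowSums(tab) / n
p_col <- colSums(tab) / n
p_e_k <- sum(p_row * p_col)
kappa <- (p_o - p_e_k) / (1 - p_e_k)

## Gwet's AC1: chance agreement from the average marginal proportions
pi_k <- (p_row + p_col) / 2
p_e1 <- sum(pi_k * (1 - pi_k)) / (K - 1)
ac1  <- (p_o - p_e1) / (1 - p_e1)

round(c(observed = p_o, kappa = kappa, AC1 = ac1), 3)

In tables where one category dominates, as ‘well appearing’ does here, κ’s chance-agreement term is large and pulls κ toward zero while AC1’s is not, which is one recognized explanation for gaps of the size reported above.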

Conclusion: The inter-rater reliability of emergency providers' assessment of overall clinical appearance was adequate when described graphically and by Gwet’s AC. Different summary statistics yield different results for the same dataset.

Author Comment

This is a submission to PeerJ for review.

Supplemental Information

Technical appendices

Appendix 1. Intra-rater agreement (first rater) with row and column percentages, and intra-rater agreement (second rater) with row and column percentages
Appendix 2. R code for AC calculation
Appendix 3. Stata code for data management and generating graphs
Appendix 4. Stata code for simulations of different orders of examiners
Appendix 5. Output from AgreeStat for various measures of agreement

DOI: 10.7287/peerj.preprints.444v1/supp-1

HIPAA-compliant dataset

Stata format available on request

DOI: 10.7287/peerj.preprints.444v1/supp-2

Description of variables and labels

DOI: 10.7287/peerj.preprints.444v1/supp-3