The influence of rater training on inter- and intra-rater reliability when using the rat grimace scale
- Subject Areas
- Animal Behavior, Veterinary Medicine, Anesthesiology and Pain Management
- Keywords
- rat grimace scale, RGS, refinement, welfare, pain assessment, training, 3Rs, scale validation
- Copyright
- © 2018 Zhang et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2018. The influence of rater training on inter- and intra-rater reliability when using the rat grimace scale. PeerJ Preprints 6:e26721v2 https://doi.org/10.7287/peerj.preprints.26721v2
Abstract
Rodent grimace scales facilitate assessment of spontaneous pain and can identify a range of acute pain levels. Reported rater training in using these scales varies considerably and may contribute to observed variability in inter-rater reliability. This study evaluated the effect of training on inter-rater reliability with the Rat Grimace Scale (RGS). Two training sets, of 42 and 150 images, were prepared from several acute pain models. Four trainee raters progressed through two rounds of training, first scoring 42 images (S1) and then 150 images (S2a). After each round, trainees reviewed the RGS and any problematic images with an experienced rater. The 150 images were then re-scored (S2b). Four years after training, all trainees re-scored the 150 images (S2c). Inter- and intra-rater reliability were evaluated using the intraclass correlation coefficient (ICC), and ICCs were compared with a Feldt test. Inter-rater reliability increased from moderate (0.58 [95% CI: 0.43-0.72]) to very good (0.85 [0.81-0.88]) between S1 and S2b (p < 0.01) and also increased between S2a and S2b (p < 0.01). The action units with the highest and lowest ICCs at S2b were orbital tightening (0.84 [0.80-0.87]) and whiskers (0.63 [0.57-0.70]), respectively. When compared with an experienced rater, the ICCs for all trainees improved, ranging from 0.88 to 0.91 at S2b. Four years later, very good inter-rater reliability was retained (0.82 [0.76-0.84]) and intra-rater reliability was good or very good (0.78-0.87). Training improves inter-rater reliability between trainees, with an associated narrowing of the 95% CI. Training also improved agreement between trainees and an experienced rater. Performance was retained after several years. The beneficial effects of training potentially reduce data variability and improve experimental animal welfare.
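For readers unfamiliar with the ICC, the following is a minimal sketch of how an inter-rater ICC can be computed from a table of scores (one row per image, one column per rater). The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), is an assumption here. The function name `icc_2_1` and the example ratings are illustrative and are not data from the study.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: list of rows, one row per rated image, one column per rater.
    Assumed form; the abstract does not specify the ICC model used.
    """
    n = len(scores)        # number of images (subjects)
    k = len(scores[0])     # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)

    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: images, raters, residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-image mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Illustrative ratings only (6 images, 4 raters), not RGS data from the study
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))  # prints 0.29
```

In this example the raters rank the images similarly but differ systematically in how high they score, which an absolute-agreement ICC penalises; a consistency-type ICC would give a higher value for the same table.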
Author Comment
This new version includes results from an additional analysis to evaluate intra-rater reliability 4 years after the initial training period.