High statistical noise limits conclusiveness of ranking results as a benchmarking tool for university management

Office of the Rectorate, University of Vienna, Vienna, Austria
Department of Anthropology, University of Vienna, Vienna, Austria
DOI
10.7287/peerj.preprints.938v1
Subject Areas
Science and Medical Education, Science Policy, Statistics
Keywords
Times Higher Education Ranking, ARWU Ranking, Shanghai Ranking, Regression analysis, Statistical fluctuations
Copyright
© 2015 Sorz et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Sorz J, Fieder M, Wallner B, Seidler H. 2015. High statistical noise limits conclusiveness of ranking results as a benchmarking tool for university management. PeerJ PrePrints 3:e938v1

Abstract

Regression analyses of results from the Times Higher Education (THE) World University Rankings and Shanghai Jiao Tong University’s Academic Ranking of World Universities (ARWU) from 2010 to 2014 show fluctuations in rank and score for lower-scoring universities (below position 50) that lead to inconsistent ups and downs in the total results, especially in the THE rankings. Furthermore, year-to-year results for universities below rank 50 do not correspond between the THE and ARWU rankings. We conclude that the observed fluctuations in the THE rankings do not reflect actual university performance, and that ranking results are therefore of limited conclusiveness for the management of lower-scoring universities. We suggest that THE and ARWU alter their ranking procedures so that universities below position 50 are ranked only in groups of 25 or 50. Year-to-year changes in the ARWU scores are very small, so substantial changes between consecutive years cannot be expected; we therefore argue that this ranking should be published less frequently. Additionally, we argue for introducing a standardization process for the data in both rankings, using common, suitable reference data to create calibration curves, whether linear or non-linear.
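To make the two central ideas concrete, the following minimal Python sketch (not from the paper; all data values, variable names, and the banding function are purely illustrative assumptions) shows how year-to-year rank fluctuation for a single university might be quantified with a simple linear regression, and how exact ranks below position 50 could instead be reported in bands of 25 as proposed above.

```python
# Minimal sketch (not the authors' code): quantifying year-to-year
# rank fluctuation with a linear regression on hypothetical data.
import numpy as np
from scipy import stats

# Hypothetical ranks of one university across the 2010-2014 editions.
years = np.array([2010, 2011, 2012, 2013, 2014])
ranks = np.array([78, 112, 95, 130, 88])  # illustrative values only

# Fit rank against year; a small R^2 with large residuals suggests
# that year-to-year movement is dominated by noise rather than trend.
fit = stats.linregress(years, ranks)
residual_sd = np.std(ranks - (fit.intercept + fit.slope * years))
print(f"slope per year: {fit.slope:.2f}, R^2: {fit.rvalue**2:.2f}")
print(f"residual SD (rank positions): {residual_sd:.1f}")

# Proposed remedy: report universities below position 50 only in
# bands (e.g. 51-75, 76-100) instead of exact positions.
def band(rank, width=25, exact_until=50):
    if rank <= exact_until:
        return str(rank)
    lo = ((rank - exact_until - 1) // width) * width + exact_until + 1
    return f"{lo}-{lo + width - 1}"

print([band(r) for r in ranks])  # e.g. ['76-100', '101-125', ...]
```

Under this banded reporting, the illustrative ranks 78, 95, and 88 all fall in the same 76-100 group, so the apparent movement between those editions, which the regression residuals mark as noise, would no longer be reported as a change in position.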

Author Comment

This is a submission to PeerJ for review.