This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Despite the central role of species distributions in ecology and conservation, occurrence information remains geographically and taxonomically incomplete and biased. Numerous socio-economic and ecological drivers of uneven record collection and mobilization among species have been suggested, but the generality of their effects remains untested. We develop scale-independent metrics of range coverage and geographical record bias, and apply them to 2.8M point-occurrence records of 3,625 mammal species to evaluate 13 putative drivers of species-level variation in data availability. We find that data limitations are mainly linked to range size and shape, and the geography of socio-economic conditions. Surprisingly, species attributes related to detection and collection probabilities, such as body size or diurnality, are much weaker predictors of the amount and range coverage of available records. Our results highlight the need to prioritize range-restricted species and to address the key socio-economic drivers of data bias in data mobilization efforts and distribution modeling.
This manuscript is currently submitted to another journal. Version 2 of the preprint corrects spelling mistakes.