A theory and methodology to quantify knowledge
- Subject Areas
- Computational Biology, Ethical Issues, Science Policy, Statistics, Computational Science
- soft science, hard science, philosophy of science, research misconduct, questionable research practices, reproducibility, pseudo-science, positivism, falsification, relativism
- © 2018 Fanelli
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2018. A theory and methodology to quantify knowledge. PeerJ Preprints 6:e1968v4 https://doi.org/10.7287/peerj.preprints.1968v4
This article proposes quantitative answers to meta-scientific questions including "how much knowledge is attained by a research field?", "how rapidly is a field making progress?", "what is the expected reproducibility of a result?", "how much knowledge is lost from scientific bias and misconduct?", "what do we mean by soft science?", and "what demarcates a pseudoscience?".
Knowledge is suggested to be a system-specific property measured by K, a quantity determined by how much the information contained in an explanandum is compressed by an explanans, which is composed of an information "input" and a "theory/methodology" conditioning factor. The three arguments of the K function are quantifiable using methods of classic and algorithmic information theory. This approach is justified on three grounds: 1) K results from postulating that information is finite and knowledge is information compression; 2) K is compatible with, and convertible to, ordinary measures of effect size and algorithmic complexity; 3) K is physically interpretable as a measure of entropic efficiency. Moreover, the K function has useful properties that support its potential as a measure of knowledge.
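The idea of knowledge as compression can be illustrated with a minimal sketch. The abstract does not state the functional form of K, so the ratio below is only an assumption chosen for illustration: the bits of the explanandum `y` that are explained by the input `x`, divided by the total bits spent on `y`, `x` and a theory description. The names `entropy_bits`, `conditional_entropy_bits`, `k_sketch` and the parameter `theory_bits` are hypothetical and are not taken from the article.

```python
import math
from collections import Counter


def entropy_bits(samples):
    """Shannon entropy, in bits, of the empirical distribution of `samples`."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def conditional_entropy_bits(y, x):
    """Estimate H(Y|X) in bits as H(X, Y) - H(X)."""
    return entropy_bits(list(zip(x, y))) - entropy_bits(x)


def k_sketch(y, x, theory_bits):
    """Hypothetical K-like ratio (an assumption, not the article's definition).

    Numerator: bits of y explained by x, i.e. H(Y) - H(Y|X).
    Denominator: bits spent on y, on x, and on describing the theory.
    """
    h_y = entropy_bits(y)
    explained = h_y - conditional_entropy_bits(y, x)
    cost = h_y + entropy_bits(x) + theory_bits
    return explained / cost


# An input that predicts the explanandum perfectly.
x_perfect = [0, 1, 0, 1, 0, 1, 0, 1]
y_perfect = list(x_perfect)

# An input that carries no information about the explanandum.
x_useless = [0, 0, 1, 1]
y_useless = [0, 1, 0, 1]

print(k_sketch(y_perfect, x_perfect, theory_bits=0.0))
print(k_sketch(y_perfect, x_perfect, theory_bits=2.0))
print(k_sketch(y_useless, x_useless, theory_bits=0.0))
```

Under these assumptions, a perfectly predictive binary input with no theory cost gives a ratio of 0.5, a longer theory description lowers the ratio, and an uninformative input gives 0. The sketch only shows the qualitative behaviour described above; the article's own K function should be consulted for the actual definition.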
Examples from a variety of fields are given to illustrate the possible uses of K. These examples include quantifying: the knowledge value of proving Fermat's last theorem; the accuracy of measurements of the electron's mass; the half-life of eclipse predictions; the usefulness of evolutionary models of reproductive skew; the significance of gender differences in personality; the sources of irreproducibility in psychology; the impact of scientific misconduct and questionable research practices (QRP); the knowledge value of astrology. Furthermore, a cumulative K may complement ordinary meta-analysis and may give rise to a universal classification of sciences and pseudosciences.
Simple and memorable mathematical formulae summarize the theory's key results and implications. In addition to practical uses in meta-research, these formulae may have conceptual applications in philosophy and in research policy, and may guide scientists to make progress on all frontiers of knowledge.
This is an entirely new version of the manuscript, with a simplified theory and examples of applications. The theory has been stripped of all non-essential elements. New arguments are presented for the validity of K, new properties are discussed, and proofs are generally made tighter. The Results section now offers numerous practical examples, using real data or simulations, of how K can be used to answer meta-scientific questions.
R code used to generate all figures and analyses.