Implicit value updating explains transitive inference performance: The betasort model
- Subject Areas
- Animal Behavior, Neuroscience, Psychiatry and Psychology
- Keywords
- reinforcement learning, transitive inference, cognition, rhesus macaques
- Copyright
- © 2015 Jensen et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
- Cite this article
- 2015. Implicit value updating explains transitive inference performance: The betasort model. PeerJ PrePrints 3:e954v1 https://doi.org/10.7287/peerj.preprints.954v1
Abstract
Transitive inference (the ability to infer that “B>D” given that “B>C” and “C>D”) is a widespread characteristic of serial learning, observed in dozens of species. Despite these robust behavioral effects, reinforcement learning models reliant on reward prediction error (RPE) or associative strength routinely fail to perform these inferences. We propose an algorithm called betasort, inspired by cognitive processes, which performs transitive inference at low computational cost. This is accomplished by (1) representing stimulus positions along a unit span using beta distributions, (2) treating positive and negative feedback asymmetrically, and (3) updating the position of every stimulus during every trial, whether that stimulus was visible or not. Performance was compared for rhesus macaques, humans, the betasort algorithm, and Q-learning (an established RPE model). Of these, only Q-learning failed to respond above chance during critical test trials. Implications for cognitive/associative rivalries, as well as for the model-based/model-free dichotomy, are discussed.
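The three ingredients listed in the abstract can be illustrated with a short Python sketch. This is not the authors' implementation: the abstract does not state the update rules, so every rule below is an assumption chosen only to show the idea. Hypothetical names include `BetaSortSketch`, `train_adjacent_pairs`, and the `upper`/`lower` pseudo-counts that parameterize each beta distribution. The assumed rules are: a rewarded trial reinforces every stimulus at its current estimated position, while an unrewarded trial pushes the chosen stimulus down, the unchosen one up, and shifts unseen stimuli away from the pair.

```python
import random


class BetaSortSketch:
    """Illustrative sketch only; the update rules here are assumptions."""

    def __init__(self, n_items, seed=0):
        # (1) Each stimulus position on the unit span is a beta distribution,
        # stored as two pseudo-counts (hypothetical names).
        self.upper = [1.0] * n_items
        self.lower = [1.0] * n_items
        self.rng = random.Random(seed)

    def mean(self, i):
        """Expected position of stimulus i on the unit span."""
        return self.upper[i] / (self.upper[i] + self.lower[i])

    def choose(self, a, b):
        """Sample a position for each stimulus and pick the larger draw."""
        draw_a = self.rng.betavariate(self.upper[a], self.lower[a])
        draw_b = self.rng.betavariate(self.upper[b], self.lower[b])
        return a if draw_a > draw_b else b

    def update(self, chosen, other, rewarded):
        means = [self.mean(i) for i in range(len(self.upper))]
        lo = min(means[chosen], means[other])
        hi = max(means[chosen], means[other])
        # (3) Every stimulus is updated on every trial, presented or not.
        for i, m in enumerate(means):
            if rewarded:
                # (2) Positive feedback: consolidate the current estimate.
                up, down = m, 1.0 - m
            elif i == chosen:
                # (2) Negative feedback: the chosen stimulus moves down...
                up, down = 0.0, 1.0
            elif i == other:
                # ...and the unchosen stimulus moves up.
                up, down = 1.0, 0.0
            elif m < lo:
                up, down = 0.0, 1.0  # assumed: unseen item below the pair moves down
            elif m > hi:
                up, down = 1.0, 0.0  # assumed: unseen item above the pair moves up
            else:
                up, down = m, 1.0 - m  # assumed: items in between consolidate
            self.upper[i] += up
            self.lower[i] += down


def train_adjacent_pairs(model, n_items, n_trials):
    """Train on adjacent pairs only; the lower index is the rewarded choice."""
    for _ in range(n_trials):
        first = model.rng.randrange(n_items - 1)
        pair = (first, first + 1)
        choice = model.choose(*pair)
        other = pair[1] if choice == pair[0] else pair[0]
        model.update(choice, other, rewarded=(choice == first))


if __name__ == "__main__":
    model = BetaSortSketch(5, seed=1)
    train_adjacent_pairs(model, 5, 500)
    # Estimated positions for a five-item list trained on adjacent pairs.
    print([round(model.mean(i), 2) for i in range(5)])
```

Under these assumptions, a non-adjacent test pair (such as the second and fourth items) can be probed with `model.choose(1, 3)` after training, without that pair ever having been presented together.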
Author Comment
This is the first version of a preprint.