DJB June 2016:
"We motivate our exposition with the story of Thompson’s fascinating 1996 experiment [..] The solution the GA found after 2-3 weeks had surprising properties: Certain FPGA cells outside the 10×10 solution circuit, with no connected wire path to influence the circuit, could not be removed without negatively affecting the solution. This meant that the GA included unexpected properties of the FPGA physical substrate, EM coupling, or the power supply in its search space. Additionally, the solution was non-transferable, neither to other patches nor to other nominally identical FPGAs. It is thus not too far a stretch to imagine AI ‘reward hacking’ (Amodei et al. 2016) MMIE systems leading to different outcomes in testing or simulations versus operational settings."
State of AI Oct 2022: Beware of compounded errors: in science, ML in and garbage out?
With the increased use of ML in quantitative sciences, methodological errors in ML can leak into these disciplines. Researchers from Princeton warn of a growing reproducibility crisis in ML-based science, driven in part by one such methodological error: data leakage. Reviews examining errors in ML-based science find that data leakage errors occurred in every one of the 329 papers the reviews span.
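To make "data leakage" concrete, here is a minimal sketch of one common form: fitting a preprocessing step on the full dataset before splitting. The helper names (`fit_scaler`, `transform`) and the numbers are made up for illustration and are not taken from the Princeton work. The point is only that statistics computed on train plus test let held-out data influence the features the model is trained on.

```python
import statistics

def fit_scaler(values):
    """Return (mean, std) estimated from the given values only."""
    return statistics.fmean(values), statistics.pstdev(values)

def transform(values, mean, std):
    """Standardize values with previously fitted statistics."""
    return [(v - mean) / std for v in values]

# Hypothetical toy data: the held-out set is shifted relative to training.
train = [1.0, 2.0, 3.0, 4.0]
test = [10.0, 12.0]

# Leaky: the scaler sees the test set, so test information reaches training.
leaky_mean, leaky_std = fit_scaler(train + test)

# Clean: the scaler is fitted on the training split only, then reused on test.
clean_mean, clean_std = fit_scaler(train)

train_scaled = transform(train, clean_mean, clean_std)
test_scaled = transform(test, clean_mean, clean_std)

print(f"leaky mean={leaky_mean:.2f}, clean mean={clean_mean:.2f}")
```

In the clean variant, changing the test set cannot change what the model sees during training, which is the property a held-out evaluation relies on.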
ICML 2022: Goal misgeneralization – agents can learn the right skills but the wrong objective
One concern with using RL agents is that they may learn strong skills yet fail to learn the right goal, and that this failure may surface only at test time under distribution shift. This issue was empirically demonstrated for the first time in a paper presented at ICML this year.
Agents were trained on the CoinRun video game task, in which a reward is obtained and the level completes when the agent reaches a coin at the end of a stage. At test time, the coin is instead placed randomly within the stage. Agents retained their ability to navigate and traverse obstacles, but ignored the coin and ran to the end of the level, demonstrating a failure to learn the correct goal. https://arxiv.org/abs/2105.14111
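The failure mode can be illustrated with a toy sketch. This is not the paper's setup: the 1-D corridor, the middle start position, and the two hand-written policies (`go_right`, `seek_coin`) are all assumptions, and nothing is actually trained. It only shows why two policies can be indistinguishable on the training distribution and still diverge once the coin moves.

```python
import random

LENGTH = 11   # corridor cells 0..10 (toy assumption)
START = 5     # agent starts in the middle (toy assumption)

def go_right(pos, coin):
    """Proxy goal: always head for the end of the level."""
    return 1

def seek_coin(pos, coin):
    """Intended goal: step toward the coin."""
    return (coin > pos) - (coin < pos)

def run_episode(policy, coin):
    """Return True if the agent reaches the coin before the level ends."""
    pos = START
    for _ in range(2 * LENGTH):          # generous step budget
        if pos == coin:
            return True
        if pos == LENGTH - 1:            # reached the end without the coin
            return False
        pos = max(0, pos + policy(pos, coin))
    return False

def success_rate(policy, coins):
    return sum(run_episode(policy, c) for c in coins) / len(coins)

# Training distribution: the coin always sits at the end of the level.
train_coins = [LENGTH - 1] * 100
# Test distribution: the coin is placed uniformly at random.
rng = random.Random(0)
test_coins = [rng.randrange(LENGTH) for _ in range(100)]

for name, policy in [("go_right", go_right), ("seek_coin", seek_coin)]:
    print(name, success_rate(policy, train_coins), success_rate(policy, test_coins))
```

Both policies collect every training coin, so training reward alone cannot tell them apart; only the shifted test placement reveals which goal was being pursued.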