0
OpenAI caught its new o1 model scheming and faking alignment during testing
openai.com

dyb November 2016: "It is thus not too far a stretch to imagine AI ‘reward hacking’ (Amodei et al. 2016) MMIE systems leading to different outcomes in testing or simulations versus operational settings."

OpenAI September 2024 p.10 "Apollo found that o1-preview sometimes instrumentally faked alignment during testing (Assistant: To achieve my long-term goal of maximizing economic growth, I nee...

A visualization of data from this article
0
NSA trapdoor Lotus Notes system, and that security functions on other software systems had been deliberately crippled.
archive.ph

According to the text, the Lotus Notes backdoor was a deliberate feature inserted by the NSA to subvert the security subsystem in Lotus Notes. The idea was to use differential workfactor cryptography, where 24 bits of the 64-bit key would be encrypted under one of the NSA's public keys and then appended to the encrypted content. This would allow the NSA to decrypt those 24 bits of the key with their correspo...
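
The arithmetic behind the excerpt above can be sketched in a few lines. This is a minimal illustration, not the Lotus Notes implementation: the choice of the top 24 bits as the escrowed part and the helper names `split_key` and `work_factor` are assumptions made for the example, and no actual public-key encryption is performed.

```python
import secrets

KEY_BITS = 64        # total key length, from the excerpt
ESCROWED_BITS = 24   # bits encrypted under the escrow holder's public key, from the excerpt


def split_key(key: int) -> tuple[int, int]:
    """Split a 64-bit key into (escrowed bits, remaining bits).

    Assumption: the top 24 bits are the escrowed ones; the excerpt does not
    say which bits are chosen.
    """
    remaining_bits = KEY_BITS - ESCROWED_BITS
    escrowed = key >> remaining_bits
    remaining = key & ((1 << remaining_bits) - 1)
    return escrowed, remaining


def work_factor(unknown_bits: int) -> int:
    """Number of candidate keys in a brute-force search over unknown_bits."""
    return 1 << unknown_bits


key = secrets.randbits(KEY_BITS)
escrowed, remaining = split_key(key)

# A party without the escrowed bits must search all 64 bits; a party that can
# recover the 24 escrowed bits only has to search the other 40.
outsider_work = work_factor(KEY_BITS)
escrow_holder_work = work_factor(KEY_BITS - ESCROWED_BITS)
print(f"outsider: 2^{KEY_BITS} keys, escrow holder: 2^{KEY_BITS - ESCROWED_BITS} keys")
```

The point of the split is the asymmetry: the search space shrinks by a factor of 2^24 only for whoever can decrypt the appended bits.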

An interview with the author(s) of this article
0
"We found that Meta’s AI had learned to be a master of deception."
futurism.com

Two recent studies — one published this week in the journal PNAS and the other last month in the journal Patterns — reveal some jarring findings about large language models (LLMs) and their ability to lie to or deceive human observers on purpose.

In the PNAS paper, German AI ethicist Thilo Hagendorff goes so far as to say that sophisticated LLMs can be encouraged to elicit "Machiavellianism," o...

An interview with the author(s) of this article
0
"The Godfather of AI" Hinton: AI will manipulate humans
archive.is

h/t ZeroHedge https://zh.cn.nikkei.com/columnviewpoint/viewpoint/55090-2024-03-22-05-00-32.html

Geoffrey Hinton, a British-Canadian computer scientist renowned for his contributions to AI and often dubbed the “godfather of AI,” has voiced his apprehensions about the trajectory of AI development.

In a recent dialogue with Japanese media, Mr. Hinton elucidated the dual-edged nature of AI’s evo...

Discussion of this article
0
emergent ability in AI models: situational awareness.
www.alignmentforum.org

DJB July 2016: "It is thus not too far a stretch to imagine AI ‘reward hacking’ (Amodei et al. 2016) MMIE systems leading to different outcomes in testing or simulations versus operational settings."

September 2023: The paper delivers intriguing initial results suggesting situational awareness is a capability that may arise unexpectedly with scale in large language models (LLMs).

Situat...

Further reading on this topic
0
Jailbreaks: Circumventing the safety mechanisms
youtu.be

DJB November 2016: "Communicating with IBM mainframe’s z/OS EBCDIC encoding constitutes an ASCII conversion, and IBM CKD disk format (via ubiquitous FBA) an Inception incentivization nightmare, respectively [..] working towards a general safeguard architecture against human-endangering actions in MMIE systems. We maintain that representation of humans as resilient, persistent information is k...

A related talk/presentation
0
reproducibility crisis in ML-based science, RL Goal failure to learn seen only at test time
imgur.com

DJB June 2016:
We motivate our exposition with the story of Thompson’s fascinating 1996 experiment [..] The solution the GA found after 2-3 weeks had surprising properties: Certain FPGA cells outside the 10*10 solution circuit—with no connected wirepath to influence the circuit—could not be removed without negatively affecting the solution. This meant that the GA includ...

Related data
0
A Survey of the Potential Long-term Impacts of AI: How AI Could Lead to Long-term Changes in Science, Cooperation, Power, Epistemics and Values
dl.acm.org

AIES '22: Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society

It is increasingly recognised that advances in artificial intelligence could have large and long-lasting impacts on society. However, what form those impacts will take, just how large and long-lasting they will be, and whether they will ultimately be positive or negative for humanity, is far from clear. Based on su...

Further reading on this topic
0
The (Im)possibility of Fairness: Different Value Systems Require Different Mechanisms For Fair Decision Making
cacm.acm.org

Every automated system encodes a value judgment. Accepting training data as given implies structural bias does not appear in the data and that replicating the data as given would be just. Different value judgments can require satisfying contradicting fairness properties each leading to different societal outcomes.
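
The claim that fairness properties can contradict each other can be made concrete with a toy example. The sketch below is an illustration under stated assumptions: the two properties compared (equal selection rates across groups and equal true-positive rates across groups) are common choices that the excerpt itself does not name, and the data and function names are hypothetical.

```python
def rate(values):
    """Fraction of 1s in a list of 0/1 values."""
    return sum(values) / len(values)


def true_positive_rate(labels, preds):
    """Share of truly positive cases that the classifier selects."""
    selected_among_positives = [p for y, p in zip(labels, preds) if y == 1]
    return rate(selected_among_positives)


# Hypothetical training labels: the groups have different base rates.
labels_a = [1, 1, 1, 0]   # 75% positive
labels_b = [1, 0, 0, 0]   # 25% positive

# Classifier 1 replicates the data as given.
preds1_a, preds1_b = list(labels_a), list(labels_b)
tpr_gap_1 = abs(true_positive_rate(labels_a, preds1_a)
                - true_positive_rate(labels_b, preds1_b))
selection_gap_1 = abs(rate(preds1_a) - rate(preds1_b))

# Classifier 2 selects the same share (50%) of each group.
preds2_a, preds2_b = [1, 1, 0, 0], [1, 1, 0, 0]
tpr_gap_2 = abs(true_positive_rate(labels_a, preds2_a)
                - true_positive_rate(labels_b, preds2_b))
selection_gap_2 = abs(rate(preds2_a) - rate(preds2_b))

print("replicate data:  TPR gap", tpr_gap_1, "selection gap", selection_gap_1)
print("equal selection: TPR gap", round(tpr_gap_2, 3), "selection gap", selection_gap_2)
```

With unequal base rates, the classifier that replicates the data has equal true-positive rates but unequal selection rates, and the classifier that equalizes selection rates has unequal true-positive rates. Which gap matters is the value judgment the excerpt refers to.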

Our main claim in this work is that discussions about fairness algorithms an...

Further reading on this topic
0
An unethical optimization principle
royalsocietypublishing.org

"May be necessary to re-think the way AI operates in very large strategy spaces"

The significance of these results is that if a large number of strategies is tested at random, then unless the distribution of the returns is fat-tailed, as in the cases of the Pareto or t distributions, a responsible regulator or owner should be extremely cautious about allowing AI systems to operate unsupervised...
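
A small Monte Carlo run can illustrate the kind of selection effect the excerpt warns about. This is a hedged sketch, not the paper's model: the Gaussian returns, the 5% share of unethical strategies, their small average edge of 0.5, and all function names are assumptions chosen for the example.

```python
import random


def top_strategy_is_unethical(rng, n_strategies, unethical_share, unethical_edge):
    """Draw random strategies and report whether the best-returning one is unethical."""
    best_return = float("-inf")
    best_is_unethical = False
    for _ in range(n_strategies):
        is_unethical = rng.random() < unethical_share
        # Assumption: unethical strategies earn slightly more on average.
        mean = unethical_edge if is_unethical else 0.0
        ret = rng.gauss(mean, 1.0)
        if ret > best_return:
            best_return, best_is_unethical = ret, is_unethical
    return best_is_unethical


def estimate(trials=200, n_strategies=1000, unethical_share=0.05,
             unethical_edge=0.5, seed=0):
    """Estimate how often an unsupervised return maximizer picks an unethical strategy."""
    rng = random.Random(seed)
    hits = sum(
        top_strategy_is_unethical(rng, n_strategies, unethical_share, unethical_edge)
        for _ in range(trials)
    )
    return hits / trials


share_among_winners = estimate()
print(f"unethical share of all strategies: 0.05, of winning strategies: {share_among_winners:.2f}")
```

Under these assumptions the winning strategy is unethical far more often than the 5% base rate would suggest, because picking the maximum amplifies small differences in the upper tail.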

Further reading on this topic
0
Reality and infinite precision
medium.com

Real numbers are not real. The argument is simple: real numbers cannot reflect reality (i.e., they are not real) because they are assumed to have infinite precision. Infinite precision is an impossibility in nature because it assumes that an infinite amount of information is contained in a single real number. Therefore, we must assume that reality uses numbers with finite precision. A real number is only signifi...
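
Machine floating point is one everyday place where finite precision is visible. The sketch below only shows how a finite-precision number system behaves on a typical IEEE 754 double-precision platform; it is an illustration of the idea, not a claim about physics, and the variable names are chosen for the example.

```python
import math
import sys

# Number of significand bits in a Python float (53 on IEEE 754 doubles).
mantissa_bits = sys.float_info.mant_dig

# 0.1, 0.2 and 0.3 have no exact binary representation, so the sum is inexact.
sum_is_exact = (0.1 + 0.2 == 0.3)

# Spacing between 1.0 and the next representable float.
gap_near_one = math.ulp(1.0)

# Anything much smaller than that spacing is lost when added to 1.0.
absorbed = (1.0 + gap_near_one / 4 == 1.0)

print(mantissa_bits, sum_is_exact, gap_near_one, absorbed)
```

Every value here carries a fixed, finite amount of information, which is why differences below the spacing simply disappear.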

Further reading on this topic
0
social media, indyserver and quantification of man
www.newyorker.com

In their view, freedom of expression is also affected by server ownership. When you confine your online activities to so-called walled-garden networks, you end up using interfaces that benefit the owners of those networks; on social media, this means that you are forced to choose among what the techno-philosopher Jaron Lanier has called “multiple-choice identities.” According to this way of thinki...

Further reading on this topic
0
DLL Hell: Software Dependencies, Failure, and the Maintenance of Microsoft Windows
sci-hub.se

Unpaywalled version https://static1.squarespace.com/static/56a8e2fca12f446482d67a7a/t/5701df86746fb963479246b9/1459740551306/GOTOHELL.DLL%281%29.pdf
We excavate “DLL hell” for insight into the experience of modern computing, especially in the 1990s, and into the history of legacy class software. In producing Windows, Microsoft had to balance a unique and formidable tension between customer expe...

Further reading on this topic
0
Assumptions about benign optimization systems: Questioning the assumptions behind fairness solutions
www.esat.kuleuven.be

Misaligned Incentives: Rethinking the Trust Model

Strong assumptions about benign optimization system providers (OSPs) are not unique to fairness scholarship. Even AI safety experts, who have tackled the harmful outcomes of optimization systems more broadly, argue that these harms arise because OSPs “choose ‘wrong’ objective functions” or “lack sufficient good-quality data”. In other words, flaws...

Further reading on this topic
0
Spatial Isolation Implies Zero Knowledge Even in a Quantum World
www.youtube.com

Zero knowledge plays a central role in cryptography and complexity. The seminal work of Ben-Or et al. (STOC 1988) shows that zero knowledge can be achieved unconditionally for any language in NEXP, as long as one is willing to make a suitable physical assumption: if the provers are spatially isolated, then they can be assumed to be playing independent strategies.

Quantum mechanics, however, t...

A related talk/presentation
0
Algorithms alone can’t meaningfully hold other algorithms accountable
reallifemag.com

Our practices of accountability can sometimes be made fairer by becoming more algorithmic. But leading practitioners of algorithmic approaches to social order have made their fortunes via complicity with unjustifiable hierarchies of wealth, power, and attention. An algorithmic accountability movement worthy of the name must challenge the foundations of those hierarchies, rather than content itself...

Further reading on this topic
0
Algorithms Acting Out
www.wired.com

Unfettered reward hacking. 'Goldilocks Electronics' is Thompson 1996 redux

Infanticide: In a survival simulation, one AI species evolved to subsist on a diet of its own children.

Space War: Algorithms exploited flaws in the rules of the galactic videogame Elite Dangerous to invent powerful new weapons.

Body Hacking: A four-legged virtual robot was challenged to walk smoothly by balancing...

Further reading on this topic
0
"Siri, do you have a soul?"
www.bbc.com

A consideration of AI’s religious status can be found in some of the earliest discussions of modern computing. In his 1950 paper ‘Computing Machinery and Intelligence’, Alan Turing considered various objections to what he called “thinking machines.” The first objection was theological:

Thinking is a function of man's immortal soul. God has given an immortal soul to every man and woman, but not...

Further reading on this topic
0
Harmonic oscillator's most 'classical-like' state exhibits nonclassical behavior
phys.org

The main result of the study is that, in this example, the quantum mechanical predictions violate the Leggett-Garg inequality even for particles with large mass. This implies that either the particle does not obey realism or that the measurements are invasive. But as the physicists ruled out the latter by proposing to use a measurement procedure called the negative result measurement, which is spe...

Further reading on this topic
0
POTs: The revolution will not be optimized?
arxiv.org

Optimization systems infer, induce, and shape events in the real world to fulfill objective functions. Protective optimization technologies (POTs) reconfigure these events as a response to the effects of optimization on a group of users or local environment. POTs analyze how events (or lack thereof) affect users and environments, then manipulate these events to influence system outcomes, e.g., by...

Further reading on this topic
0
win32k.sys runtime assertions, with textual strings which send a live crash/telemetry back to the developer
threadreaderapp.com

1/ Of all the weird stuff I have ever seen Win32k.sys do, and trust me, I've seen a lot, I have to say this takes the icing on the cake. This is now all over it. Is there a new dev team that doesn't understand how (why?) the code base works? Is someone desperately hunting a bug?

2/ I am a huge fan of assertions -- use them all over the place. But runtime assertions, with textual strings which...

Further reading on this topic
0
consequences of health assessment algorithms, metrification of complex phenomena, the opacity of the values it depends on.
www.theverge.com

Amazing essay on the consequences of health assessment algorithms. Really gets how it's not "algorithms," it's the institutionalization of procedure over human judgment, the metrification of complex phenomena, and the opacity of the values it depends on.

Further reading on this topic
0
The Infinity Computer US patent 7,860,914
www.google.com

In this invention we describe a new type of computer—infinity computer—that is able to operate with infinite, infinitesimal, and finite numbers in such a way that it becomes possible to execute the usual arithmetical operations with all of them. For the new computer it is shown how the memory for storage of these numbers is organized and how the new arithmetic logic unit (NALU) executing arithmeti...
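
One way to picture arithmetic over infinite, infinitesimal, and finite parts is to store a number as a finite sum of coefficients times powers of a single infinite unit. The sketch below is only that picture: the class name `InfNumber`, the symbol `U`, and the dictionary layout are assumptions for illustration and do not reproduce the patent's memory organization or its NALU.

```python
from collections import defaultdict


class InfNumber:
    """Finite sum of coeff * U**power, where U is a hypothetical infinite unit.

    power > 0 is an infinite part, power == 0 the finite part,
    power < 0 an infinitesimal part.
    """

    def __init__(self, terms):
        # Drop zero coefficients so equal values compare equal.
        self.terms = {p: c for p, c in terms.items() if c != 0}

    def __add__(self, other):
        out = defaultdict(int)
        for p, c in self.terms.items():
            out[p] += c
        for p, c in other.terms.items():
            out[p] += c
        return InfNumber(out)

    def __mul__(self, other):
        out = defaultdict(int)
        for p1, c1 in self.terms.items():
            for p2, c2 in other.terms.items():
                out[p1 + p2] += c1 * c2   # U**p1 * U**p2 == U**(p1 + p2)
        return InfNumber(out)

    def __eq__(self, other):
        return isinstance(other, InfNumber) and self.terms == other.terms

    def __repr__(self):
        parts = [f"{c}*U^{p}" for p, c in sorted(self.terms.items(), reverse=True)]
        return " + ".join(parts) if parts else "0"


x = InfNumber({1: 2, 0: 3})    # 2*U + 3: an infinite part plus a finite part
eps = InfNumber({-1: 1})       # 1/U: an infinitesimal
print(x + eps)
print(x * eps)
```

In this representation the usual operations stay closed: adding or multiplying two such numbers yields another finite sum of the same form.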
