
The Nonlinear Library
AF - Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception? by David Scott Krueger
Sep 4, 2024
01:01
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception?, published by David Scott Krueger on September 4, 2024 on The AI Alignment Forum.
AI systems up to some high level of intelligence plausibly need to know exactly where they are in space-time in order for deception/"scheming" to make sense as a strategy.
This is because they need to know:
1) what sort of oversight they are subject to
and
2) what effects their actions will have on the real world
(side note: Acausal trade might break this argument)
There are a number of informal proposals to keep AI systems selectively ignorant of (1) and (2) in order to prevent deception. Those proposals seem very promising to flesh out; I'm not aware of any rigorous work doing so, however. Are you?
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
AI systems up to some high level of intelligence plausibly need to know exactly where they are in space-time in order for deception/"scheming" to make sense as a strategy.
This is because they need to know:
1) what sort of oversight they are subject to
and
2) what effects their actions will have on the real world
(side note: Acausal trade might break this argument)
There are a number of informal proposals to keep AI systems selectively ignorant of (1) and (2) in order to prevent deception. Those proposals seem very promising to flesh out; I'm not aware of any rigorous work doing so, however. Are you?
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Remember Everything You Learn from Podcasts
Save insights instantly, chat with episodes, and build lasting knowledge - all powered by AI.