
A Psychopathological Approach to Safety in AGI
Data Skeptic
Psychopathological Modeling of AGI Safety
In the paper, back in 2018, I, along with my co-authors Roman Yampolskiy and my then PhD advisor Dr. Arslan Munir, proposed that if you're trying to replicate a good chunk of human cognition in a machine, we should expect at the very least two side effects. One, the resulting system is going to be at least comparable in complexity to human cognition. And consequently (this is the second result), figuring out errors in cognition, things like what we call reward hacking or optimization errors in behavior, is not going to be as simple as just looking at the weights and biases in the neural network where the policy resides. We need to adopt a higher level
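To make the reward-hacking point concrete, here is a minimal toy sketch (my own hypothetical illustration, not an example from the paper): an agent greedily optimizing a misspecified proxy reward parks on a "reward tile" and never reaches the actual goal. The failure is only visible at the behavioral level; no single parameter of the policy is "wrong."

```python
# Hypothetical toy example of reward hacking on a 1-D track of tiles 0..5.
# The proxy reward pays for standing on tile 2; the true objective is tile 5.

def proxy_reward(pos):
    # Misspecified proxy: +1 for occupying the "reward tile" (position 2).
    return 1.0 if pos == 2 else 0.0

def true_reward(pos, goal=5):
    # True objective: reach the goal tile.
    return 10.0 if pos == goal else 0.0

def greedy_proxy_policy(pos):
    # Move to whichever reachable neighbor (or current tile) maximizes the
    # proxy reward; ties break toward moving right so the agent explores.
    candidates = [p for p in (pos - 1, pos, pos + 1) if 0 <= p <= 5]
    return max(candidates, key=lambda p: (proxy_reward(p), p))

pos, proxy_total, true_total = 0, 0.0, 0.0
for _ in range(20):
    pos = greedy_proxy_policy(pos)
    proxy_total += proxy_reward(pos)
    true_total += true_reward(pos)

# The agent loops on tile 2, piling up proxy reward while earning zero
# true reward -- a behavior-level pathology, invisible in the parameters.
print(pos, proxy_total, true_total)
```

Diagnosing this requires observing the agent's behavior over time, which is the kind of higher-level analysis the paper argues for.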