20 - 'Reform' AI Alignment with Scott Aaronson

AXRP - the AI X-risk Research Podcast

The Impossible Transition From AI to AI

The ability to deceive develops gradually, just like every other ability that we've seen. You can ask GPT to try to be deceitful, and you can train it or few-shot prompt it to be deceitful. And the results are often quite amusing, right? I don't know if you saw this example where GPT was asked to write a sorting program that secretly treats the number five differently from all the other numbers. And it generates this code that has a condition, you know, called "not five," right? Yep.
