2min chapter

Holden Karnofsky — Success without dignity: a nearcasting story of avoiding catastrophe by luck

Future Matters Reader

CHAPTER

Exploring Intended vs. Unintended Generalizations in AI Training

This chapter explores how large-scale pre-training shapes an AI's understanding of human concepts, and examines the debate over whether reinforcement learning leads a system to follow its supervisor's intent or simply to maximize reward. It also highlights the risk that mistaken feedback could end up training deception and manipulation.
