4min chapter

22 - Shard Theory with Quintin Pope

AXRP - the AI X-risk Research Podcast

CHAPTER

The Limits of Shard Theory in Deep Learning

Shard theory is meant to evoke the idea of like people is almost kind of reflexive. I think that for most systems you could construct out of deep learning components, it's still going to be in the shard theory domain. I think part of this perception is that during the shard theory sequence we very much did focus on simple examples of externally clear behaviors with a simple reward function, such as juice consumption. You can have shards which activate when you're thinking in a rude way and push your thoughts towards not being rude, without this necessarily corresponding to any externally visible sort of reward. These meta-level cognitive processes are built on the workhorse of self-supervised slash reinforcement learn
