

4 - Risks from Learned Optimization with Evan Hubinger

AXRP - the AI X-risk Research Podcast

CHAPTER

Is There a Behaviour Objective?

A thermostat is an example of a system that has a behavior objective, but not necessarily a mesa-objective or an internally represented objective. So I want to ask another question about this sort of interplay between this idea of planning and internally represented objectives. If you're trying to look at a trained model and understand what that model's objective is, you have to have transparency tools, or some means of being able to look inside the model, that give you a real understanding of what it is doing. You can imagine that situation. I think clearly, the answer is, well, the mesa-objective should be the, you know, the negative
