4min chapter

168 - How to Solve AI Alignment with Paul Christiano

Bankless

CHAPTER

The Importance of Learning to Be Honest

There are lots of cases in which it is incentivized to lie to or mislead the human. There's a gap between lying that will get caught and lying that won't get caught. And so you can ask: if we train neural nets, do they exhibit this kind of switch abruptly? If they get put in a position where they could get away with something really sinister, will they then do it? I think one reason for optimism right now is that no one has ever really exhibited that phenomenon in a convincing way. It just is much easier as your models get more competent. It's only recently that we've trained models which are actually able to understand the mechanics of their training process.
