"Discussion with Nate Soares on a key alignment difficulty" by Holden Karnofsky

LessWrong (Curated & Popular)

The Importance of Random Goals in AI Training

The AI is never forced to pattern-match "true" POUDA avoidance of the kind that nice humans actually have. A sufficiently intelligent agent with any random goal can perform well in an environment, even when that requires acting as though it has another goal. It's generally a good bet that the AI will learn all these little patterns, both the CIS ones and the POUDA-avoidance ones, in imperfect ways.
