A Linear Alignment Solution for Deep Learning

An aligned AI should be able to play the game without ruining the fun or doing something obviously destructive, like completely taking over the world. Andrew Critch is primarily concerned about a multipolar AI scenario. Creating this standardized test environment, where alignment failures are observable, is one component of a good global outcome. The more recent stuff seems less relevant and cultured.

