Aiming for Safe Behaviour

We don't just want human imitation, we want superhuman capabilities. But without unsafe behavior, there is a very fine line between a naughty rithm and one that's finding inevative solutions to problems that humans haven't been able to solve. Part of the reason that this is so difficult is general efectis called goodhart's law. Goodhart's law says when a metric becomes a target, it ceases to be a good metric.

Play episode from 10:02

Transcript

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app