Is Chain of Thought Deception a Problem ByDefault?

This seems like it gets us a little further, because we can now apply some r l, which might cause deception that is now catchable. One problem identified by john wentworth with this strategy is that the a i might cause large bad sihde effects, like wiping out humans without intent. This seems like a problem, but this project still is clear progress toward solving alignment eli's opinion,. Though significantly more promising on shorter time lines, where something like current llm's scale to a g is future of humanity.

Play episode from 49:44

Transcript

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app