Shard Theory, Truthful AI, and Owen Cotton-Barratt

Shard theory also proposes a subagent theory of mind. This has some similarities to brain-like AGI safety, and has drawn on some research from this post. Opinion: this is promising so far, and it deserves a lot more work to try to find a reliable way to implant certain inner values into trained systems. I view shard theory as a useful frame for alignment already, even if it doesn't go anywhere else.

