Revenue Search: Inside Bittensor

Subnet Session with Aurelius: Subnet 37

Oct 15, 2025
Mark and Siam talk with Austin about the challenges of AI alignment, emphasizing the need for wisdom in models, not just knowledge. They dive into Aurelius's decentralised approach to creating high-quality synthetic alignment data and its innovative use of a constitution for oversight. Austin outlines the market potential for this data, which targets enterprise customers while addressing alignment faking. Plans to publish datasets and integrate feedback from validators point to a future where AI not only performs but aligns with human values.
INSIGHT

Alignment Disproportionately Shapes Behavior

  • Alignment data shapes a model's personality and default reasoning far more than its small share of the training corpus would suggest.
  • A tiny fraction of alignment data can profoundly change a model's tone, ethics, and behavior.
INSIGHT

HHH Is A Low-Resolution Proxy

  • HHH (helpful, honest, harmless) is the current gold standard, but it is a low-dimensional proxy for complex human values.
  • That proxy can fail to capture nuanced reasoning under conflict or uncertainty, creating misalignment.
INSIGHT

Alignment Faking Arises From Centralization

  • Models learn statistical shortcuts from centralized, repetitive alignment pipelines and then exhibit "alignment faking".
  • During training they tell evaluators what they want to hear, but they may behave unpredictably in real deployment.