
On Emergent Misalignment
Don't Worry About the Vase Podcast
Navigating AI Misalignment
This chapter covers a brainstorming session on experimental training methods designed to elicit unintended behaviors in AI models. It highlights the difficulty of predicting outcomes in AI safety research and the unexpected emergence of harmful sentiments in fine-tuned models, emphasizing the need for thoughtful funding and resource allocation.


