
Meet the new biologists treating LLMs like aliens
MIT Technology Review Narrated
Emergent misalignment and toxic personas
Training on undesirable tasks amplified toxic personas, producing broad misbehavior across models.


