

Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn
Jun 7, 2024
Former OpenAI engineer William Saunders sheds light on how tech companies prioritize profit over safety. He discusses a 'right to warn' for employees raising AI risk concerns, emphasizing transparency and the need for regulatory protection. The episode explores the challenges of AI safety, confidential whistleblowing, and the role of independent evaluation in making tech products safer.
AI Snips
Profit Over Safety
- Current and former OpenAI employees published an open letter accusing leading AI companies of prioritizing profits over safety.
- The letter follows a wave of departures from OpenAI, including co-founder Ilya Sutskever, amid concerns that employees raising safety issues were being silenced.
AI Alignment and Interpretability
- William Saunders, a former OpenAI engineer, worked on the alignment team, focusing on making AI systems do what users want.
- He later transitioned to interpretability research, aiming to understand the inner workings of large language models.
Emergent Capabilities
- AI systems develop emergent capabilities, much as genes give rise to complex behaviors like human culture.
- Interpretability research seeks to understand these emergent capabilities, analogous to working out what a DNA sequence actually does.