"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Exploitable by Default: Vulnerabilities in GPT-4 APIs and “Superhuman” Go AIs with Adam Gleave of Far.ai

Robustness and Alignment of AI Systems

The conversation highlights the critical work of probing the robustness and alignment of AI systems. The results discussed suggest that current machine learning systems are exploitable by default, with substantial vulnerabilities found along several dimensions. One notable exploit is accidental jailbreaking through fine-tuning of GPT-4, where even a novice developer can unintentionally remove safety filters by fine-tuning on harmless examples.