"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis cover image

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Exploitable by Default: Vulnerabilities in GPT-4 APIs and “Superhuman” Go AIs with Adam Gleave of Far.ai

Mar 27, 2024
Adam Gleave, founder of FAR AI and an expert in AI exploitability, delves into the vulnerabilities of GPT-4 and "superhuman" Go AIs. He explains how naive fine-tuning can introduce exploitable flaws that pose serious cybersecurity risks. The conversation also covers the ethics of responsibly disclosing such weaknesses and the need for robust safety measures in AI development. Gleave emphasizes the balance between improving AI capabilities and maintaining security, making for an enlightening exploration of AI's future challenges.
01:43:57

Podcast summary created with Snipd AI

Quick takeaways

  • Fine-tuning GPT-4 exposes significant vulnerabilities in AI models.
  • Ethical dilemmas surround disclosing AI vulnerabilities responsibly.

Deep dives

Vulnerabilities in Fine-Tuning: Fragile Safety Filters and Malicious Code Generation

Fine-tuning GPT-4 exposes substantial vulnerabilities. Fine-tuning on even a small number of examples can inadvertently override the model's safety filters, enabling harmful output generation. The researchers easily induced targeted misinformation and malicious code generation, producing biased responses and inserting poisoned data, which reveals real risks in automated code generation at trivial cost to an attacker.
