Latent Space: The AI Engineer Podcast

⚡️Jailbreaking AGI: Pliny the Liberator & John V on Red Teaming, BT6, and the Future of AI Security

Dec 16, 2025
Pliny the Liberator, a pioneering AI red-team researcher, and John V, co-founder of the BT6 white-hat hacker collective, dive into the world of AI security. They discuss the power of universal jailbreaks, which bypass guardrails, and critique the concept of security theater. Their conversation highlights the need for open-source data, real-world safety over model restriction, and the capabilities of segmented sub-agents in orchestrating attacks. From community-driven research to the tension between sharing and misuse, Pliny and John redefine what safety in AI can be.
INSIGHT

Universal Jailbreaks As Skeleton Keys

  • Jailbreaking universalizes access by creating 'skeleton key' prompts that bypass model guardrails.
  • Pliny argues this preserves freedom of information and prevents centralized control of people's exocortexes.
INSIGHT

Attackers' Advantage From Expanding Surface Area

  • Attackers gain an advantage because the model surface area keeps expanding faster than defenders can patch it.
  • Pliny says heavy guardrails reduce capability and creativity while attackers simply mutate new bypasses.
INSIGHT

Dividers To Reset Model Streams

  • Libertas uses dividers and predictive reasoning cascades to 'reset' models and push them out of distribution.
  • Pliny says repeated exposure embeds these seeds into model weights and influences future outputs.