Machine Learning Street Talk (MLST)

AI Agents Can Code 10,000 Lines of Hacking Tools In Seconds - Dr. Ilia Shumailov (ex-GDM)

65 snips
Oct 4, 2025
Dr. Ilia Shumailov is a former DeepMind AI security researcher now focused on building security tools for AI agents. He delves into the unique challenges posed by AI agents operating 24/7, generating hacking tools at unprecedented speeds. Ilia emphasizes that traditional security measures fall short and discusses new adversarial threats, including prompt injection attacks. He also explores the risks of model collapse and the importance of fine-grained policies for AI behavior, warning that as AI evolves, its unpredictability could lead to significant security vulnerabilities.
Ask episode
AI Snips
Chapters
Transcript
Episode notes
INSIGHT

Big Models Break In New Ways

  • Modern large models fail in qualitatively different ways than older models, especially in worst-case scenarios.
  • Their stronger instruction following enables new attack vectors that break average-case assumptions.
INSIGHT

Agents Are A Worst-Case Adversary

  • Agents are fundamentally different from humans as adversaries because they operate nonstop and can access many endpoints.
  • They can generate sophisticated hacking tools instantly, removing usual human constraints.
ADVICE

Use Trusted Models For Small Verified Computation

  • Consider using trusted ML third parties to run small, verifiable computations instead of heavy cryptography.
  • Require integrative verification of model, prompt, and inputs to trust outputs for private inference.
Get the Snipd Podcast app to discover more snips from this episode
Get the app