AI Security Podcast

Inside the $29.5 Million DARPA AI Cyber Challenge: How Autonomous Agents Find & Patch Vulns

Nov 6, 2025
Michael Brown, Principal Security Engineer at Trail of Bits and leader of the Buttercup project in DARPA's AI Cyber Challenge, shares insights into building autonomous AI systems for vulnerability detection. He explains how Buttercup, despite his initial skepticism, impressed him with high-quality patch generation, thanks to a 'best of both worlds' approach that combines AI with traditional methods. Michael also discusses the competition's unique challenges, the importance of robust engineering, and practical tips for applying AI to security tasks. The long-term goal for Buttercup is automatic bug fixing at scale for the open-source community.
INSIGHT

Competition Demands Real, Verifiable Fixes

  • DARPA's AIxCC required fully autonomous systems to find and patch vulnerabilities in open-source repos, with proof of exploitability.
  • This structure forced low false positives and made outputs realistic for maintainers to accept.
INSIGHT

Scoring Favored Complete, Fast Remediation

  • Patching was weighted far higher than discovery in semifinals: 6 points for a patch vs 2 for a find.
  • The finals added time decay and bonuses for proofs, incentivizing speed and end-to-end verification.
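
To make the incentive concrete, here is a minimal Python sketch of the arithmetic. The 6 and 2 point values come from the snip above; the `decayed` helper, its linear shape, and its `floor` parameter are purely hypothetical illustrations of what "time decay" could look like, not the competition's actual formula.

```python
# Point values quoted in the episode for the semifinals.
PATCH_POINTS = 6
FIND_POINTS = 2


def semifinal_score(finds: int, patches: int) -> int:
    """Total semifinal points for a number of finds and patches."""
    return FIND_POINTS * finds + PATCH_POINTS * patches


def decayed(points: float, elapsed_hours: float, window_hours: float,
            floor: float = 0.5) -> float:
    """HYPOTHETICAL linear time decay (an assumption, not the real rule).

    Full points at submission time zero, falling linearly to
    `floor * points` at the end of the scoring window.
    """
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    # Clamp the elapsed fraction of the window to [0, 1].
    fraction = min(max(elapsed_hours / window_hours, 0.0), 1.0)
    return points * (1.0 - (1.0 - floor) * fraction)


if __name__ == "__main__":
    # One find plus its patch is worth four times the find alone.
    print(semifinal_score(finds=1, patches=0))  # 2
    print(semifinal_score(finds=1, patches=1))  # 8
    # Under the assumed decay, a patch submitted halfway through the window.
    print(decayed(PATCH_POINTS, elapsed_hours=12, window_hours=24))  # 4.5
```

Under these numbers, a team that only finds a bug leaves three quarters of the available points on the table, which is why the scoring favored complete remediation over discovery alone.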
ANECDOTE

Rapid Build Cycles Delivered Results

  • Trail of Bits built Buttercup v1 in three and a half months for the semifinals.
  • They rebuilt Buttercup v2 in six months before the finals and achieved second place.