
AI Security Podcast
Inside the $29.5 Million DARPA AI Cyber Challenge: How Autonomous Agents Find & Patch Vulns
Nov 6, 2025
Michael Brown, Principal Security Engineer at Trail of Bits and leader of the Buttercup project in DARPA's AI Cyber Challenge, shares insights into building autonomous AI systems for vulnerability detection. He explains how Buttercup, despite his initial skepticism, produced high-quality patches thanks to a 'best of both worlds' approach that combines AI with traditional methods. Michael also discusses the competition's unique challenges, the importance of robust engineering, and practical tips for applying AI to security tasks. The future goal for Buttercup is automatic bug fixing at scale for the open-source community.
Competition Demands Real, Verifiable Fixes
- DARPA's AIxCC required fully autonomous systems to find and patch vulnerabilities in open-source repos, with proof of exploitability for each finding.
- This structure forced low false positives and made outputs realistic for maintainers to accept.
Scoring Favored Complete, Fast Remediation
- Patching was weighted far higher than discovery in semifinals: 6 points for a patch vs 2 for a find.
- The finals added time decay and bonuses for proofs, incentivizing speed and end-to-end verification.
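As a rough illustration of why this scoring rewards patching and speed, here is a minimal Python sketch. Only the semifinal weights (2 points per find, 6 per patch) come from the episode. The function names, the linear decay shape, the `window_hours` parameter, and the `floor` value are all hypothetical and are not the official AIxCC formula.

```python
FIND_POINTS = 2   # from the episode: semifinal points for a find
PATCH_POINTS = 6  # from the episode: semifinal points for a patch


def semifinal_score(finds: int, patches: int) -> int:
    """Total semifinal points using the weights quoted in the episode."""
    return finds * FIND_POINTS + patches * PATCH_POINTS


def decayed_points(base: float, elapsed_hours: float,
                   window_hours: float, floor: float = 0.5) -> float:
    """Hypothetical time decay: points fall linearly from `base` at
    submission time 0 to `base * floor` at the end of the window.
    The real finals formula is not given in the source.
    """
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    # Clamp the elapsed fraction of the window to [0, 1].
    fraction = min(max(elapsed_hours / window_hours, 0.0), 1.0)
    return base * (1.0 - (1.0 - floor) * fraction)


if __name__ == "__main__":
    # Three finds and two patches under the semifinal weights.
    print(semifinal_score(finds=3, patches=2))
    # The same patch submitted immediately vs. halfway through the window.
    print(decayed_points(PATCH_POINTS, 0, 10))
    print(decayed_points(PATCH_POINTS, 5, 10))
```

Under these assumptions, a single patch is worth as much as three finds, and a patch submitted early keeps more of its value than one submitted late, which matches the incentive the snip describes.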
Rapid Build Cycles Delivered Results
- Trail of Bits built Buttercup v1 in three and a half months for the semifinals.
- They rebuilt Buttercup v2 in six months before the finals and achieved second place.


