
AI + a16z
The AI That Found A Bug In The World’s Most Audited Code
Dec 10, 2025
Matt Knight, OpenAI's VP of Security Products and Research, discusses Aardvark, an AI agent that finds vulnerabilities the way a human security researcher would. He traces the progression from GPT-3's limitations to GPT-4's breakthroughs in log analysis and in decoding cybercrime chats. Aardvark's ability to automate threat modeling, validate exploits, and generate patches could ease the cybersecurity labor shortage and support open source maintainers. Knight emphasizes that AI's role is to augment human analysts, not replace them.
AI Snips
AI Found A Memory Bug In OpenSSH
- Matt Knight describes how an AI found a memory corruption bug in OpenSSH, a highly audited project.
- He emphasizes the potential blast radius of such bugs reaching Linux distributions and critical infrastructure.
Models Progressed From Useless To Operational
- GPT-3 couldn't handle real security tasks like log analysis or code review and often hallucinated.
- Advances through GPT-4 and later models made language models practical for operational security workflows.
GPT-4 Triaged SSH Logs Like An Analyst
- A GPT-4 snapshot classified interactive SSH logs and accurately triaged suspicious activity such as reverse shells.
- That capability surprised the security team because earlier models could not perform such tier-one analyst tasks.

