The AI That Found A Bug In The World’s Most Audited Code

Dec 10, 2025

Matt Knight, OpenAI's VP of Security Products and Research, shares his insights on Aardvark, an AI agent that discovers vulnerabilities the way a human security researcher would. He discusses the evolution from GPT-3's limitations to GPT-4's breakthroughs in analyzing logs and decoding cybercrime chats. Aardvark's ability to automate threat modeling, validate exploits, and generate patches promises to ease the cybersecurity labor shortage and empower open source maintainers. Knight emphasizes that AI's role is to augment human analysts, not replace them.

ANECDOTE

AI Found A Memory Bug In OpenSSH

Matt Knight describes how an AI found a memory corruption bug in OpenSSH, a highly audited project.
He emphasizes the potential blast radius of such bugs reaching Linux distributions and critical infrastructure.

INSIGHT

Models Progressed From Useless To Operational

GPT-3 couldn't handle real security tasks like log analysis or code review and often hallucinated.
Advances through GPT-4 and later models made language models practical for operational security workflows.

ANECDOTE

GPT-4 Triaged SSH Logs Like An Analyst

A GPT-4 snapshot classified interactive SSH logs and accurately triaged suspicious activity such as reverse shells.
That capability surprised the security team because earlier models could not perform such tier-one analyst tasks.
