AI Agent Security: Threats & Defenses for Modern Deployments

May 21, 2025

Yifeng (Ethan) He, a PhD candidate at UC Davis specializing in software and AI security, and Peter Rong, a researcher focused on vulnerabilities in AI agents, discuss the critical threats facing AI agents. They break down issues like session hijacks and tool-based jailbreaks, highlighting the shortcomings of current defenses. The duo also advocates for effective sandboxing and agent-to-agent protocols, sharing practical strategies for securing AI deployments and emphasizing the importance of a zero-trust approach in agent security.

Ask episode

AI Snips

Chapters

Transcript

Episode notes

INSIGHT

Agents Are State In Prompts

AI agents encode their state in prompt history and chat context rather than traditional program state.
This makes user and tool inputs critical attack surfaces that can change agent behavior.

ANECDOTE

How Research Started

Peter described starting AI security research after seeing ChatGPT's code-writing evolution and questioning code safety.
That grew into studying agent attack surfaces beyond just insecure code outputs.

INSIGHT

User Data Can Poison Agents

Fine-tuning agents with user interaction opens poisoning and backdoor risks if training data is untrusted.
Malicious preferences or prompts can steer agent behavior subtly over time.

Get the Snipd Podcast app to discover more snips from this episode

Get the app