Sponsored: Why prompt injection is an intractable problem

Sep 7, 2025

Keith Hoodlet, Director of Engineering for AI, Machine Learning, and AppSec at Trail of Bits, dives into the complexities of prompt injection attacks targeting AI systems. He discusses the evolving landscape of technology and its security challenges, highlighting the difficulties in defending against these attacks. Innovative techniques like the 'line jumping' method and the 'MCP context protector' are explored as potential safeguards. Hoodlet emphasizes the importance of rigorous testing and monitoring to secure AI implementations against these persistent threats.

Ask episode

AI Snips

Chapters

Transcript

Episode notes

INSIGHT

Hidden Prompt Injection Via Image Scaling

Prompt injection can be hidden in non-obvious inputs like images by exploiting predictable processing steps.
Models may consume transformed data unseen by humans, making hidden injections effective.

ANECDOTE

Steganography-Style Example From Trail Of Bits

The team at Trail of Bits demonstrated hiding dark text that only appears after downscaling, fooling users who saw a safe image.
That image then triggered a prompt injection in backend LLMs that users did not observe.

INSIGHT

Models Confuse Data And Instructions

LLMs struggle to separate data from instructions because they're optimized to interpret human text probabilistically.
Attackers exploit model gradients and edge cases to push models into unintended behaviors.

Get the Snipd Podcast app to discover more snips from this episode

Get the app

Sponsored: Why prompt injection is an intractable problem

Hidden Prompt Injection Via Image Scaling

Steganography-Style Example From Trail Of Bits

Models Confuse Data And Instructions

Show notes