

Sponsored: Why prompt injection is an intractable problem
Sep 7, 2025
Keith Hoodlet, Director of Engineering for AI, Machine Learning, and AppSec at Trail of Bits, dives into the complexities of prompt injection attacks targeting AI systems. He discusses the evolving landscape of technology and its security challenges, highlighting the difficulties in defending against these attacks. Innovative techniques like the 'line jumping' method and the 'MCP context protector' are explored as potential safeguards. Hoodlet emphasizes the importance of rigorous testing and monitoring to secure AI implementations against these persistent threats.
AI Snips
Chapters
Transcript
Episode notes
Hidden Prompt Injection Via Image Scaling
- Prompt injection can be hidden in non-obvious inputs like images by exploiting predictable processing steps.
- Models may consume transformed data unseen by humans, making hidden injections effective.
Steganography-Style Example From Trail Of Bits
- The team at Trail of Bits demonstrated hiding dark text that only appears after downscaling, fooling users who saw a safe image.
- That image then triggered a prompt injection in backend LLMs that users did not observe.
Models Confuse Data And Instructions
- LLMs struggle to separate data from instructions because they're optimized to interpret human text probabilistically.
- Attackers exploit model gradients and edge cases to push models into unintended behaviors.