Risky Bulletin

Sponsored: Why prompt injection is an intractable problem

Sep 7, 2025
Keith Hoodlet, Director of Engineering for AI, Machine Learning, and AppSec at Trail of Bits, dives into the complexities of prompt injection attacks targeting AI systems. He discusses the evolving landscape of technology and its security challenges, highlighting the difficulties in defending against these attacks. Innovative techniques like the 'line jumping' method and the 'MCP context protector' are explored as potential safeguards. Hoodlet emphasizes the importance of rigorous testing and monitoring to secure AI implementations against these persistent threats.
Ask episode
AI Snips
Chapters
Transcript
Episode notes
INSIGHT

Hidden Prompt Injection Via Image Scaling

  • Prompt injection can be hidden in non-obvious inputs like images by exploiting predictable processing steps.
  • Models may consume transformed data unseen by humans, making hidden injections effective.
ANECDOTE

Steganography-Style Example From Trail Of Bits

  • The team at Trail of Bits demonstrated hiding dark text that only appears after downscaling, fooling users who saw a safe image.
  • That image then triggered a prompt injection in backend LLMs that users did not observe.
INSIGHT

Models Confuse Data And Instructions

  • LLMs struggle to separate data from instructions because they're optimized to interpret human text probabilistically.
  • Attackers exploit model gradients and edge cases to push models into unintended behaviors.
Get the Snipd Podcast app to discover more snips from this episode
Get the app