

How Attackers Trick AI: Lessons from Gandalf’s Creator
4 snips Mar 18, 2025
Explore the intriguing world of AI security as experts discuss the alarming vulnerabilities facing modern systems. Discover how attackers use techniques like prompt injections and jailbreaks to exploit AI models. Gain insights into Gandalf’s staggering 60M+ attack attempts, revealing urgent security challenges. Learn about the importance of red teaming and the Dynamic Security Utility Framework in preventing AI disasters. Dive into the balance between security and usability, and the dual role of AI in enhancing creativity while posing risks.
AI Snips
Chapters
Transcript
Episode notes
LLM Vulnerability
- LLMs struggle to separate developer instructions from external input/data, creating vulnerabilities.
- Data becomes executable, allowing attackers to manipulate system behavior.
Nebulous LLM Functionality
- Defining the "end" in LLM-powered systems is harder than in traditional software.
- Whitelisting is unfathomable due to the infinite possibilities of LLMs.
Dad Joke Research
- Guy Podjarny used OpenAI's deep research to find and rank dad jokes.
- This highlights the potential of agentic systems for complex tasks.