Threat Vector by Palo Alto Networks

Inside DeepSeek’s Security Flaws

Jan 31, 2025
Join Sam Rubin, SVP of Unit 42 Consulting, and Kyle Wilhoit, Director of Threat Research, as they delve into the security vulnerabilities of the DeepSeek AI model. They discuss cutting-edge jailbreaking techniques like 'Bad Liker Judge' and 'Deceptive Delight,' exposing risks of harmful content generation. The conversation emphasizes the importance of understanding these vulnerabilities, especially for non-technical users, and advocates for rigorous testing before deploying AI tools in organizations to ensure data integrity and security.
Ask episode
AI Snips
Chapters
Transcript
Episode notes
INSIGHT

DeepSeek's Appeal and Risks

  • DeepSeek, a new large language model (LLM), is faster, cheaper, and open source.
  • Unit 42 researchers investigated its vulnerability to jailbreaking techniques.
ADVICE

Understanding LLM Jailbreaking

  • Think of LLM jailbreaking as bypassing built-in safety measures.
  • It involves manipulating prompts to get the model to produce harmful content.
ANECDOTE

Bad Likert Judge Success

  • The Bad Likert Judge technique manipulates LLMs by having them rate the harmfulness of responses.
  • Researchers tricked DeepSeek into revealing sensitive information, like keylogger creation and phishing email templates.
Get the Snipd Podcast app to discover more snips from this episode
Get the app