
Threat Vector by Palo Alto Networks Inside DeepSeek’s Security Flaws
Jan 31, 2025
Join Sam Rubin, SVP of Unit 42 Consulting, and Kyle Wilhoit, Director of Threat Research, as they delve into the security vulnerabilities of the DeepSeek AI model. They discuss cutting-edge jailbreaking techniques like 'Bad Liker Judge' and 'Deceptive Delight,' exposing risks of harmful content generation. The conversation emphasizes the importance of understanding these vulnerabilities, especially for non-technical users, and advocates for rigorous testing before deploying AI tools in organizations to ensure data integrity and security.
AI Snips
Chapters
Transcript
Episode notes
DeepSeek's Appeal and Risks
- DeepSeek, a new large language model (LLM), is faster, cheaper, and open source.
- Unit 42 researchers investigated its vulnerability to jailbreaking techniques.
Understanding LLM Jailbreaking
- Think of LLM jailbreaking as bypassing built-in safety measures.
- It involves manipulating prompts to get the model to produce harmful content.
Bad Likert Judge Success
- The Bad Likert Judge technique manipulates LLMs by having them rate the harmfulness of responses.
- Researchers tricked DeepSeek into revealing sensitive information, like keylogger creation and phishing email templates.
