AXRP - the AI X-risk Research Podcast

30 - AI Security with Jeffrey Ladish

Apr 30, 2024
AI security expert Jeffrey Ladish discusses the robustness of safety training in AI models, the dangers of open LLMs, securing AI systems against attackers, and the state of computer security. The conversation also covers undoing safety filters, AI-assisted phishing, making AI more legible, securing model weights, defending against AI exfiltration, and red lines in AI development.