
AXRP - the AI X-risk Research Podcast
Episode 30 - AI Security with Jeffrey Ladish
Apr 30, 2024
AI security expert Jeffrey Ladish discusses how robust safety training in AI models really is, the dangers of open-weight LLMs, securing models against attackers, and the state of computer security. Topics include undoing safety fine-tuning, AI-assisted phishing, securing model weights, defending against AI exfiltration, red lines in AI development, and making AI more legible.
 Chapters 
 Introduction 
 00:00 • 2min 
 Exploring AI Safety Fine-Tuning and Model Development 
 01:58 • 14min 
 Navigating AI Security Risks 
 15:40 • 12min 
 The Dark Side of AI: Exploiting Vulnerabilities for Malicious Activities 
 27:36 • 19min 
 The Battle Between AI Security and Potential Threats 
 46:09 • 22min 
 Navigating the Complexities of Cybersecurity and AI 
 01:08:11 • 22min 
 Enhancing AI Security Practices 
 01:29:43 • 8min 
 Exploring AI Models' Capabilities in Task Decomposition and Deception 
 01:38:03 • 3min 
 Exploring the Potential of AI Models with Enhanced Capabilities 
 01:41:00 • 4min 
 The Rise of Language Models in Phishing Attacks 
 01:44:38 • 7min 
 Navigating Identity Verification in the Era of AI Deception 
 01:52:05 • 15min 
 Exploring Causal Scrubbing and Network Evaluation in AI Research 
 02:07:16 • 2min 
 Exploring AI Model Capabilities and the Need for Transparency 
 02:09:04 • 6min 
