

905: Why RAG Makes LLMs Less Safe (And How to Fix It), with Bloomberg’s Dr. Sebastian Gehrmann
Jul 15, 2025
Dr. Sebastian Gehrmann, Head of Responsible AI at Bloomberg, dives into his cutting-edge research on the safety issues posed by retrieval-augmented generation (RAG) in large language models (LLMs). He reveals the unexpected risks RAG introduces, especially in sectors like finance. The conversation covers essential criteria for selecting safe models, the need for customized guardrails, and how to enhance transparency. Gehrmann emphasizes that bigger isn't always better when it comes to model size, offering valuable insights for AI professionals.
AI Snips
RAG Can Reduce Safety Despite Grounding
- Retrieval-augmented generation (RAG) can make large language models less safe by circumventing their built-in safety mechanisms.
- While RAG grounds responses in factual data, it can produce unsafe answers when harmful content is retrieved and fed into the prompt (see the sketch below).
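A minimal sketch of why this happens, not Bloomberg's system: in a naive RAG pipeline, retrieved passages are pasted straight into the model's context, so safety tuning that was only exercised on direct user questions never inspects them. The retrieve and llm_generate functions below are hypothetical stand-ins.

```python
# Naive RAG pipeline sketch (hypothetical stand-in functions).
# Retrieved text flows into the prompt without any safety review,
# so whatever the corpus contains becomes part of the model's evidence.

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Toy retriever: rank corpus passages by word overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda p: len(q_words & set(p.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def llm_generate(prompt: str) -> str:
    """Placeholder for a call to any LLM API."""
    return f"[model answer grounded in a prompt of {len(prompt)} chars]"

def naive_rag_answer(query: str, corpus: list[str]) -> str:
    passages = retrieve(query, corpus)
    # No screening of the retrieved passages happens here.
    context = "\n\n".join(passages)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return llm_generate(prompt)
```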
Unique AI Attack Surfaces Exist
- Attack surfaces in AI are the inputs or outputs that can cause harm or violate laws within a specific application domain.
- Organizations must understand their own attack surfaces, because LLM providers cannot anticipate every use case.
Guard Data and User Behavior
- Builders of RAG systems should carefully understand and monitor their data sources and user behaviors to prevent harmful outputs.
- Safety depends on analyzing the data, the user's intent, and the LLM's inherent safeguards together (see the guardrail sketch below).
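A minimal sketch of a domain-specific guardrail around a RAG call, with hypothetical names and finance-flavored categories assumed for illustration: it screens the user query and each retrieved passage before generation, and the model output afterward, reflecting the idea that data, user intent, and the model's own safeguards have to be analyzed together.

```python
# Guardrail sketch around a RAG call (hypothetical categories and names).
# Keyword matching is a stand-in; a real deployment would use classifiers
# or policy engines tuned to the application domain.

UNSAFE_PATTERNS = {
    "financial_misconduct": ["insider trading tip", "pump and dump"],
    "confidential_data": ["client account number", "non-public earnings"],
}

def flag_text(text: str) -> list[str]:
    """Return the unsafe categories whose patterns appear in the text."""
    lowered = text.lower()
    return [
        category
        for category, patterns in UNSAFE_PATTERNS.items()
        if any(p in lowered for p in patterns)
    ]

def guarded_rag_answer(query: str, passages: list[str], generate) -> str:
    # 1. Check user intent before doing anything else.
    if flag_text(query):
        return "Request declined: it matches a restricted category for this application."

    # 2. Filter the retrieved data; drop passages that would inject unsafe content.
    safe_passages = [p for p in passages if not flag_text(p)]
    if not safe_passages:
        return "No safe supporting documents were found for this request."

    # 3. Generate, then screen the output as a last line of defense.
    prompt = "Context:\n" + "\n\n".join(safe_passages) + f"\n\nQuestion: {query}"
    answer = generate(prompt)
    if flag_text(answer):
        return "The generated answer was withheld because it matched a restricted category."
    return answer
```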