Retrieval-Augmented Generation (RAG) has become a dominant architecture in modern AI deployments, and in this episode, we sit down with Douwe Kiela, who co-authored the original RAG paper in 2020. Douwe is now the founder and CEO of Contextual AI, a startup focusing on helping enterprises deploy RAG as an agentic system.
We start the conversation with Douwe's thoughts on the very latest advancements in Generative AI, including GPT-4.5, DeepSeek and the exciting paradigm shift towards test-time compute, as well as the US-China rivalry in AI.
We then dive into RAG: definition, origin story and core architecture. Douwe explains the evolution of RAG into RAG 2.0 and Agentic RAG, emphasizing the importance of self-learning systems over individual models and the role of synthetic data. We close with the challenges and opportunities of deploying AI in real-world enterprise settings, discussing the balance between the accuracy enterprises need and the inherent inaccuracies of AI systems.
Contextual AI
Website - https://contextual.ai
X/Twitter - https://x.com/ContextualAI
Douwe Kiela
LinkedIn - https://www.linkedin.com/in/douwekiela
X/Twitter - https://x.com/douwekiela
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck
(00:00) Intro
(01:57) Thoughts on the latest AI models: GPT-4.5, Sonnet 3.7, Grok 3
(04:50) The test-time compute paradigm shift
(06:47) Unsupervised learning vs reasoning: a false dichotomy
(07:30) The significance of DeepSeek
(10:29) USA vs. China: is the AI war overblown?
(12:19) Controlling AI hallucinations at the model level
(13:51) RAG: definition and origin story
(18:46) Why the Transformers paper initially felt underwhelming
(20:41) The core architecture of RAG
(26:06) RAG vs. fine-tuning vs. long context windows
(30:53) RAG 2.0: Thinking in systems and not models
(31:28) Data extraction and data curation for RAG
(35:59) Contextual Language Models (CLMs)
(38:04) Fine-tuning and alignment techniques: GRIT, KTO, LENS
(40:40) Agentic RAG
(41:36) General vs. specialized RAG agents
(44:35) Synthetic data in AI
(45:51) Deploying AI in the enterprise
(48:07) How tolerant are enterprises to AI hallucinations?
(49:35) The future of Contextual AI