Inference by Turing Post

Beyond the Hype: What Silicon Valley Gets Wrong About RAG. Amr Awadallah, founder & CEO of Vectara

Aug 23, 2025
Amr Awadallah, founder and CEO of Vectara and a pioneer at Cloudera, dives deep into the world of retrieval-augmented generation (RAG). He argues that RAG isn't dead, despite trends toward larger context windows, emphasizing its role in separating memory from reasoning for accurate AI. Amr discusses the importance of retrieval with access control for trustworthy AI and critiques DIY RAG implementations. He also shares insights on hallucination detection, proposing guardian agents to enhance reliability while reflecting on the historical roots and future of AI.
Ask episode
AI Snips
Chapters
Books
Transcript
Episode notes
INSIGHT

Context Windows Aren't A RAG Replacement

  • Bigger context windows don't remove the need to pick relevant information for the model to reason well.
  • RAG separates memory (knowledge) from reasoning and yields better retrieval of key facts.
ADVICE

Protect Data With Controlled Retrieval

  • Use retrieval with access control to prevent prompt attacks from exposing sensitive data.
  • Let the retriever filter and only pass relevant, permitted information to the model.
INSIGHT

Retrieval Is Far More Compute-Efficient

  • Retrieving into the model's context window scales compute roughly quadratically with words.
  • Smart retrieval systems are sublinear and far more efficient for large information sets.
Get the Snipd Podcast app to discover more snips from this episode
Get the app