
Tool Use - AI Conversations: "This is why you need a RAG system" - Apurva Misra
Oct 28, 2025

In this conversation, Apurva Misra, founder of Sentick and an AI strategy expert, dives deep into the world of Retrieval-Augmented Generation (RAG). He explains how RAG optimizes context for LLMs, discusses the trade-offs between large context windows and targeted retrieval, and outlines strategies for creating effective embeddings. Apurva also highlights the importance of data quality and feedback loops, and introduces hybrid search, which combines keyword matching with semantic intent. A must-listen for anyone interested in enhancing AI workflows!
Why RAG Complements LLMs
- A RAG system supplies an LLM with proprietary or recent information that its base model lacks.
- It chunks documents, embeds them, stores the vectors, and retrieves the relevant chunks for each query (see the sketch below).
Context Windows Have Limits
- Bigger context windows don't fully solve relevance: performance plateaus and models lose focus as prompts grow.
- Pulling only the relevant chunks keeps prompts concise and improves accuracy in production (see the prompt-assembly sketch below).
Chunk Carefully Before Embedding
- Chunk documents thoughtfully, with a slight overlap so context is preserved across chunk boundaries (see the sketch below).
- Embed each chunk and store its vector in a vector database; this chunk-embed-store loop is your ingestion pipeline.
