How AI Is Built

Latest episodes

20 snips
Jan 9, 2025 • 1h 14min

AI-Powered Search: Context Is King, But Your RAG System Ignores Two-Thirds of It | S2 E21

Trey Grainger, author of 'AI-Powered Search' and an expert in search systems, joins the conversation to unravel the complexities of retrieval and generation in AI. He presents the concept of 'GARRAG,' where retrieval and generation enhance each other. Trey dives into the importance of user context, discussing how behavior signals improve search personalization. He shares insights on moving from simple vector similarity to advanced models and offers practical advice for engineers on choosing effective tools, promoting a structured, modular approach for better search results.
9 snips
Jan 3, 2025 • 49min

Chunking for RAG: Stop Breaking Your Documents Into Meaningless Pieces | S2 E20

Brandon Smith, a research engineer at Chroma known for his extensive work on chunking techniques for retrieval-augmented generation systems, shares his insights on optimizing semantic search. He discusses the common misconceptions surrounding chunk sizes and overlap, highlighting the challenges of maintaining context in dense content. Smith criticizes existing strategies, such as OpenAI's 800-token chunks, and emphasizes the importance of coherent parsing. He also introduces innovative approaches to enhance contextual integrity in document processing, paving the way for improved information retrieval.
Dec 19, 2024 • 48min

How AI Can Start Teaching Itself - Synthetic Data Deep Dive | S2 E18

Adrien Morisot, an ML engineer at Cohere, discusses the transformative use of synthetic data in AI training. He explores the prevalent practice of using synthetic data in large language models, emphasizing model distillation techniques. Morisot shares his early challenges in generative models, breakthroughs driven by customer needs, and the importance of diverse output data. He also highlights the critical role of rigorous validation in preventing feedback loops and the potential for synthetic data to enhance specialized AI applications across various fields.
5 snips
Dec 13, 2024 • 46min

A Search System That Learns As You Use It (Agentic RAG) | S2 E18

Stephen Batifol, an expert in Agentic RAG and advanced search technology, dives into the future of search systems. He discusses how modern retrieval-augmented generation (RAG) systems smartly match queries to the most suitable tools, utilizing a mix of methods. Batifol emphasizes the importance of metadata and modular design in creating effective search workflows. The conversation touches on adaptive AI capabilities for query refinement and the significance of user feedback in improving system accuracy. He also addresses the challenges of ambiguity in user queries, highlighting the need for innovative filtering techniques.
Dec 5, 2024 • 47min

Rethinking Search Inside Postgres, From Lexemes to BM25 | S2 E17

Philippe Noël, Founder and CEO of ParadeDB, dives into the revolutionary shift in search technology with his open-source PostgreSQL extension. He discusses how ParadeDB eliminates the need for separate search clusters by enabling search directly within databases, simplifying architecture and enhancing cost-efficiency. The conversation explores BM25 indexing, maintaining data normalization, and the advantages of ACID compliance with search. Philippe also reveals successful use cases, including Alibaba Cloud’s implementation, and practical insights for optimizing large-scale search applications.
14 snips
Nov 28, 2024 • 51min

RAG's Biggest Problems & How to Fix It (ft. Synthetic Data) | S2 E16

Saahil Ognawala, Head of Product at Jina AI and expert in RAG systems, dives deep into the complexities of retrieval-augmented generation. He reveals why RAG systems often falter in production and how strategic testing and synthetic data can enhance performance. The conversation covers the vital role of user intent, evaluation metrics, and the balancing act between real and synthetic data. Saahil also emphasizes the importance of continuous user feedback and the need for robust evaluation frameworks to fine-tune AI models effectively.
Nov 21, 2024 • 47min

From Ambiguous to AI-Ready: Improving Documentation Quality for RAG Systems | S2 E15

Max Buckley, a Google expert in LLM experimentation, dives into the hidden dangers of poor documentation in RAG systems. He explains how even one ambiguous sentence can skew an entire knowledge base. Max emphasizes the challenge of identifying such "documentation poisons" and discusses the importance of multiple feedback loops for quality control. He highlights unique linguistic ecosystems in large organizations and shares insights on enhancing documentation clarity and consistency to improve AI outputs.
5 snips
Nov 15, 2024 • 54min

BM25 is the workhorse of search; vectors are its visionary cousin | S2 E14

David Tippett, a search engineer at GitHub with expertise in BM25 and OpenSearch, delves into the efficiency of BM25 versus vector search for information retrieval. He explains how BM25 refines search by factoring in user expectations and adapting to diverse queries. The conversation highlights the challenges of vector search at scale, particularly with GitHub's massive dataset. David emphasizes that understanding user intent is crucial for optimizing search results and matters more than chasing cutting-edge technology.
Nov 7, 2024 • 36min

Vector Search at Scale: Why One Size Doesn't Fit All | S2 E13

Join Charles Xie, founder and CEO of Zilliz and pioneer behind the Milvus vector database, as he unpacks the complexities of scaling vector search systems. He discusses why vector search slows down at scale and introduces a multi-tier storage strategy that optimizes performance. Charles reveals innovative solutions like real-time search buffers and GPU acceleration to handle massive queries efficiently. He also dives into the future of search technology, including self-learning indices and hybrid search methods that promise to elevate data retrieval.
Oct 31, 2024 • 55min

Search Systems at Scale: Avoiding Local Maxima and Other Engineering Lessons | S2 E12

Stuart Cam and Russ Cam, seasoned search infrastructure experts from Elastic and Canva, dive into the complexities of modern search systems. They discuss the integration of traditional text search with vector capabilities for better outcomes. The conversation emphasizes the importance of systematic relevancy testing and avoiding local maxima traps, where improving one query can harm others. They also explore the critical balance needed between performance, cost, and indexing strategies, including practical insights into architecting effective search pipelines.
