
How AI Is Built
Real engineers. Real deployments. Zero hype. We interview the top engineers who actually put AI in production. Learn what the best engineers have figured out through years of experience. Hosted by Nicolay Gerold, CEO of Aisbach and CTO at Proxdeal and Multiply Content.
Latest episodes

May 24, 2024 • 28min
#9 Jorrit Sandbrink on Modern Data Infrastructure for Analytics and AI, Lakehouses, Open Source Data Stack
Jorrit Sandbrink, a data engineer, discusses the lakehouse architecture, which blends the data warehouse and the data lake, along with key components like Delta Lake and Apache Spark, optimizations through partitioning strategies, and data ingestion with DLT. The podcast emphasizes open-source solutions, considerations in choosing tools, and the evolving data landscape.

May 20, 2024 • 37min
#8 Kirk Marple on Knowledge Graphs for Better RAG, Virtual Entities, Hybrid Data Models
Kirk Marple, CEO of Graphlit, discusses using knowledge graphs to improve information retrieval, a hybrid data model that creates virtual entities, entity extraction with Azure Cognitive Services, a metadata-first approach to data indexing, and challenges in knowledge graph development.

May 17, 2024 • 38min
#7 Jon Warghed on Navigating the Modern Data Stack, Choosing the Right OSS Tools, From Problem to Requirements to Architecture
Data engineering expert Nicolay Gerold and software-defined assets expert Jon Erich Kemi Warghed discuss selecting the right tools, implementing data governance, and the concept of software-defined assets. They highlight the importance of data governance, open source tooling, and agile data platforms, and how tools built around software-defined assets, such as Dagster, simplify data orchestration and create business value.

May 10, 2024 • 33min
#6 John Wessel on Data Orchestration Tools, Choosing the right one for your needs
John Wessel, founder of Agreeable Data, discusses the evolution of data orchestration tools, the popularity of Apache Airflow, and the challenges of choosing the right orchestrator. They also explore the components of a data orchestrator, the role of AI in data orchestration, managing orchestrators, monitoring, and the future of orchestration tools.

May 3, 2024 • 30min
#5 Shahul Es and Jithin James on Building Reliable LLM Applications, Production-Ready RAG, Data-Driven Evals
Creators of Ragas, Shahul and Jithin, discuss challenges in building LLM applications, emphasizing the importance of evaluation, data quality, and continuous RAG evolution. Practical takeaways include starting with a solid testing strategy and embracing synthetic data to automate test data set creation.

Apr 29, 2024 • 22min
Lance v2: Rethinking Columnar Storage for Faster Lookups, Nulls, and Flexible Encodings | changelog 2
Weston Pace discusses Lance v2, the new file format behind the LanceDB vector database, which rethinks columnar storage for multimodal datasets. Goals include null value support, multimodal data handling, and optimal search performance. Lance v2 allows efficient storage of large data without excessive memory use. The episode also covers the benefits of Arrow integration and of custom encodings written in Python for experimentation.

Apr 26, 2024 • 32min
#4 Christopher Gwilliams on AI with Supabase, Postgres Configuration, Real-Time Processing, and more
Christopher Gwilliams, Solutions Architect at Supabase, discusses optimizing Postgres for AI, the core components powering real-time solutions, pgvector, and Supabase's upcoming features. Topics include setting up Postgres for AI, real-time processing, Postgres extensions, and the future roadmap of Supabase.

Apr 19, 2024 • 36min
#3 Duncan Blythe on AI Inside Your Database, Real-Time AI, Declarative ML/AI
Learn how SuperDuperDB simplifies AI integration into databases, enabling real-time computation for instant data updates. Explore the benefits of embeddings and classifications, future plans for AI-powered databases, and the framework for configuring AI workflows. Discover the challenges in computing embeddings, handling text chunks, declarative machine learning, real-time feature calculation, and advancements in model deployment.

Apr 17, 2024 • 14min
Supabase acquires OrioleDB, A New Database Engine for PostgreSQL | changelog 1
Supabase acquired OrioleDB, a new storage engine for PostgreSQL. OrioleDB uses an UNDO log for efficient updates and reduced storage. It offers performance boosters like data compression and easy integration with data lakes. The podcast discusses the benefits of OrioleDB for high-throughput databases and potential new use cases.

Apr 12, 2024 • 37min
#2 Antonio Bustamante on AI Powered Data Transformation, Combining gen & trad AI, Semantic Validation
Antonio Bustamante, a serial entrepreneur, talks about building bem.ai, a data tool for AI and software. Topics include challenges of integrating semi-structured data, using LLMs in data transformation, reliability of data infrastructure, and interoperability layers for systems.