

How AI Is Built
Nicolay Gerold
Real engineers. Real deployments. Zero hype. We interview the top engineers who actually put AI in production. Learn what the best engineers have figured out through years of experience. Hosted by Nicolay Gerold, CEO of Aisbach and CTO at Proxdeal and Multiply Content.
Episodes

Jun 7, 2024 • 40min
#011 Mastering Vector Databases, Product & Binary Quantization, Multi-Vector Search
Zain Hassan from Weaviate discusses vector databases, quantization techniques, and multi-vector search capabilities. He and the host explore the future of multimodal search, brain-computer interfaces, and EEG foundation models. Learn how vector databases handle text, image, audio, and video data efficiently.

May 31, 2024 • 46min
#010 Building Robust AI and Data Systems, Data Architecture, Data Quality, Data Storage
Data architect Anjan Banerjee discusses building complex AI and data systems, explaining data architecture with Lego analogies. Topics include selecting data tools, using Airflow for orchestration, incorporating AI for data processing, and analyzing Snowflake vs. Databricks solutions. The podcast also covers automating data integration for comprehensive customer views.

May 24, 2024 • 28min
#009 Modern Data Infrastructure for Analytics and AI, Lakehouses, Open Source Data Stack
Jorrit Sandbrink, a data engineer, discusses lakehouse architecture, which blends the data warehouse and the data lake. He covers key components like Delta Lake and Apache Spark, optimizations with partitioning strategies, and data ingestion with DLT. The episode emphasizes open-source solutions, considerations in choosing tools, and the evolving data landscape.

May 20, 2024 • 37min
#008 Knowledge Graphs for Better RAG, Virtual Entities, Hybrid Data Models
Kirk Marple, CEO of Graphlit, discusses using knowledge graphs for enhanced information retrieval, a hybrid data model that creates virtual entities, entity extraction using Azure Cognitive Services, a metadata-first approach for better data indexing, and challenges in knowledge graph development.

May 17, 2024 • 38min
#007 Navigating the Modern Data Stack, Choosing the Right OSS Tools, From Problem to Requirements to Architecture
Data engineering expert Nicolay Gerold and software-defined assets expert Jon Erich Kemi Warghed discuss selecting the right tools, implementing data governance, and the concept of software-defined assets. They highlight the importance of open source tooling and agile data platforms, and how orchestrators built around software-defined assets, such as Dagster, simplify data orchestration and create business value.

May 10, 2024 • 33min
#006 Data Orchestration Tools, Choosing the right one for your needs
John Wessel, founder of Agreeable Data, discusses the evolution of data orchestration tools, the popularity of Apache Airflow, and the challenges of choosing the right orchestrator. They also explore the components of a data orchestrator, the role of AI in data orchestration, managing orchestrators, monitoring, and the future of orchestration tools.

May 3, 2024 • 30min
#005 Building Reliable LLM Applications, Production-Ready RAG, Data-Driven Evals
Creators of Ragas, Shahul and Jithin, discuss challenges in building LLM applications, emphasizing the importance of evaluation, data quality, and continuous RAG evolution. Practical takeaways include starting with a solid testing strategy and embracing synthetic data to automate test data set creation.

Apr 29, 2024 • 22min
Lance v2: Rethinking Columnar Storage for Faster Lookups, Nulls, and Flexible Encodings | changelog 2
Weston Pace discusses Lance v2, the new file format behind the LanceDB vector database, which rethinks columnar storage for multimodal datasets. Goals include null value support, multimodal data handling, and optimal search performance. Lance v2 allows efficient storage of large data without excessive memory use. The episode also covers the benefits of Arrow integration and of custom encodings in Python for experimentation.

Apr 26, 2024 • 32min
#004 AI with Supabase, Postgres Configuration, Real-Time Processing, and more
Christopher Williams, Solutions Architect at Supabase, discusses optimizing Postgres for AI, the core components powering real-time solutions, pgvector, and Supabase's upcoming features. Topics include setting up Postgres for AI, real-time processing, Postgres extensions, and the Supabase roadmap.

Apr 19, 2024 • 36min
#003 AI Inside Your Database, Real-Time AI, Declarative ML/AI
Learn how SuperDuperDB simplifies AI integration into databases, enabling real-time computation for instant data updates. Explore the benefits of embeddings and classifications, future plans for AI-powered databases, and the framework for configuring AI workflows. Discover the challenges in computing embeddings, handling text chunks, declarative machine learning, real-time feature calculation, and advancements in model deployment.