

How AI Is Built
Nicolay Gerold
Real engineers. Real deployments. Zero hype. We interview the top engineers who actually put AI in production. Learn what the best engineers have figured out through years of experience. Hosted by Nicolay Gerold, CEO of Aisbach and CTO at Proxdeal and Multiply Content.
Episodes
Mentioned books

Aug 30, 2024 • 51min
#019 Data-driven Search Optimization, Analysing Relevance
Charlie Hull, a search expert and the founder of Flax, dives into the world of data-driven search optimization. He discusses the challenges of measuring relevance in search, emphasizing its subjective nature, and highlights common pitfalls in search assessments, such as over-indexing on speed or reacting only to user complaints. Hull shares effective methods for evaluating search systems, including human relevance judgments and user interaction analysis. He also explores the balancing act between business goals and user needs, and the crucial role of data quality in delivering optimal search results.
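To make "measuring relevance" concrete: the standard offline approach is to score a ranked result list against human relevance judgments. Below is a minimal, illustrative sketch using nDCG; the queries and judgment grades are invented for illustration, not data from the episode.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: graded relevance, log-discounted by rank."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(ranked_relevances, k=10):
    """Normalize DCG by the ideal (sorted) ranking so scores compare across queries."""
    actual = dcg(ranked_relevances[:k])
    ideal = dcg(sorted(ranked_relevances, reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0

# Hypothetical judgments: relevance grades (0-3) for the top results each query returned.
judged = {"wireless headphones": [3, 2, 0, 1, 3], "usb c cable": [1, 0, 0, 2]}
for query, rels in judged.items():
    print(f"{query}: nDCG@10 = {ndcg(rels):.3f}")
```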

Aug 15, 2024 • 53min
#018 Query Understanding: Doing The Work Before The Query Hits The Database
Join Daniel Tunkelang, a seasoned search consultant and leader in AI-powered search, as he explores the nuances of query understanding. He emphasizes that the user's query is paramount and advocates for a proactive approach to enhancing search systems. Discover the significance of query specificity, the advantages of classifying queries, and how simpler techniques can rival complex models. Tunkelang also shares insights on optimizing query processing and the challenges of categorizing data in an ever-evolving landscape.
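As a flavor of the "simpler techniques" Tunkelang favors, here is a hedged sketch of rule-based query classification performed before the query hits the index. The categories and patterns are illustrative assumptions, not his production rules.

```python
import re

# Illustrative categories; a real system derives these from its own query logs and domain.
def classify_query(query: str) -> str:
    q = query.strip().lower()
    if re.fullmatch(r"[\w-]+\d{3,}", q):        # e.g. "sku-10482": looks like an identifier
        return "known-item"
    if any(tok in q for tok in ("how", "why", "what", "?")):
        return "question"
    if len(q.split()) == 1:
        return "broad"                          # single term: ambiguous, may need expansion
    return "attribute-rich"                     # multiple constraints, route to faceted search

for q in ["sku-10482", "how do embeddings work?", "shoes", "red running shoes size 10"]:
    print(q, "->", classify_query(q))
```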

Aug 8, 2024 • 4min
Season 2 Trailer: Mastering Search
Today we are launching season 2 of How AI Is Built.

Over the last few weeks, we spoke to a lot of regular listeners and past guests, collected feedback, and analyzed our episode data. We will be applying those learnings to season 2.

This season will be all about search.

We are trying to make it better, more actionable, and more in-depth. The goal is that at the end of this season, you have a full-fledged course on search in podcast form, with mini-courses on specific elements like RAG.

We will be talking to experts from information retrieval, information architecture, recommendation systems, and RAG; from academia and industry. These are fields that do not really talk to each other.

We will try to unify and transfer the knowledge and give you a full tour of search, so you can build your next search application or feature with confidence.

We will be talking to Charlie Hull on how to systematically improve search systems, to Nils Reimers on the fundamental flaws of embeddings and how to fix them, to Daniel Tunkelang on how to actually understand the queries of the user, and many more.

We will try to bridge the gaps: how to take decades of research and practice in iteratively improving traditional search and apply it to RAG; how to take new methods from recommendation systems and vector databases and bring them into traditional search systems; how to use all of the different methods as search signals and combine them to deliver the results your user actually wants.

We will be using two types of episodes:

Traditional deep dives, like we have done so far. Each one will dive into one specific topic within search, interviewing an expert on that topic.

Supplementary episodes, which answer one additional question; often complementary or precursory knowledge for a deep dive that we did not get to.

We will be starting with episodes next week, looking at the first, last, and overarching action in search: understanding user intent and understanding queries, with Daniel Tunkelang.

I am really excited to kick this off.

I would love to hear from you:

What would you love to learn this season?

What guests should I have on?

What topics should I make a deep dive on (try to be specific)?

Let me know in the comments or just slide into my DMs on Twitter or LinkedIn.

I am looking forward to hearing from you guys. I want to be more interactive, so anytime you encounter anything unclear or a question pops up in one of the episodes, give me a shout and I will try to answer it for you and for everyone.

Enough of me rambling. Let's kick this off. I will see you next Thursday, when we start with query understanding.

Shoot me a message and stay up to date:
LinkedIn
X (Twitter)

Jul 16, 2024 • 36min
#017 Unlocking Value from Unstructured Data, Real-World Applications of Generative AI
Jonathan Yarkoni, founder of Reach Latent, discusses using generative AI to extract value from unstructured data in industries like legal and weather prediction. He delves into the challenges of AI projects, the impact of ChatGPT, and future AI trends. Topics include the reduced data-cleaning burden of generative AI, optimized tech stacks, and the potential of synthetic data generation for training AI systems.

Jul 12, 2024 • 46min
#016 Data Processing for AI, Integrating AI into Data Pipelines, Spark
Abhishek Choudhary and Nicolay discuss data processing for AI: when to use Spark versus simpler tools, the key components of Spark and how they have evolved, integrating AI into data pipelines, challenges with latency, data storage strategies, and orchestration tools. Abhishek also shares tips for reliability in production, insights on Spark's role in managing big data, utilizing Spark for ML applications, and enhancing consistency in large language models.
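To ground "integrating AI into data pipelines," the sketch below applies an embedding step inside a Spark job via a pandas UDF. The embedding function is a stand-in assumption, not a tool from the episode; swap in a real, batched model client.

```python
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import ArrayType, FloatType

spark = SparkSession.builder.appName("ai-pipeline-sketch").getOrCreate()

@pandas_udf(ArrayType(FloatType()))
def embed(texts: pd.Series) -> pd.Series:
    # Stand-in embedding: replace with a real model call (batched to amortize overhead).
    return texts.map(lambda t: [float(len(t)), float(t.count(" ") + 1)])

df = spark.createDataFrame(
    [("doc1", "spark handles big data"), ("doc2", "use simpler tools for small data")],
    ["id", "text"],
)
df.withColumn("embedding", embed("text")).show(truncate=False)
```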

Jul 4, 2024 • 35min
#015 Building AI Agents for the Enterprise, Agent Cost Controls, Seamless UX
Rahul Parundekar, Founder of AI Hero, discusses building AI agents for the enterprise, focusing on realistic use cases, expert workflows, seamless user experiences, cost controls, and new paradigms for agent interactions beyond chat.

Jun 27, 2024 • 32min
#014 Building Predictable Agents through Prompting, Compression, and Memory Strategies
Richmond Alake and Nicolay discuss building AI agents, prompt compression, memory strategies, and experimentation techniques. They highlight prompt compression for cost reduction, the components of memory management, performance optimization, prompting techniques like ReAct, and the importance of continuous experimentation in the AI field.
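As a minimal illustration of prompt compression for cost reduction, the sketch below trims conversation memory to a token budget, keeping the newest turns. This naive approach is an assumption for illustration, not the method discussed in the episode.

```python
def compress_history(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose rough token count fits the budget."""
    kept, used = [], 0
    for msg in reversed(messages):          # walk newest-first
        tokens = len(msg.split())           # crude token estimate for illustration
        if used + tokens > budget:
            break
        kept.append(msg)
        used += tokens
    return list(reversed(kept))             # restore chronological order

history = ["user: plan a trip", "agent: where to?", "user: tokyo in may", "agent: noted"]
print(compress_history(history, budget=8))  # keeps only the newest turns that fit
```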

Jun 25, 2024 • 15min
Data Integration and Ingestion for AI & LLMs, Architecting Data Flows | changelog 3
In this episode, Kirk Marple, CEO and founder of Graphlit, shares his expertise on building efficient data integrations.
Kirk breaks down his approach using relatable concepts:
The "Two-Sided Funnel": This model streamlines data flow by converting various data sources into a standard format before distributing it.
Universal Data Streams: Kirk explains how he transforms diverse data into a single, manageable stream of information.
Parallel Processing: Learn about the "competing consumer model" that allows for faster data handling (a minimal sketch follows this list).
Building Blocks for Success: Discover the importance of well-defined interfaces and actor models in creating robust data systems.
Tech Talk: Kirk discusses data normalization techniques and the potential shift towards a more streamlined "Kappa architecture."
Reusable Patterns: Find out how Kirk's methods can speed up the integration of new data sources.
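As a rough illustration of the funnel and the competing consumer model together, the sketch below pushes records from several sources into one canonical queue and drains it with parallel workers. The asyncio design and names are assumptions for illustration, not Graphlit's implementation.

```python
import asyncio
from dataclasses import dataclass

@dataclass
class CanonicalItem:
    source: str   # where the record came from (slack, email, rss, ...)
    body: str     # normalized payload: one shape for every source

async def producer(queue: asyncio.Queue, source: str, records: list[str]):
    for r in records:
        await queue.put(CanonicalItem(source=source, body=r))  # funnel: many formats in, one out

async def consumer(queue: asyncio.Queue, name: str):
    while True:
        item = await queue.get()
        print(f"{name} ingests {item.source}: {item.body}")    # competing consumers share one stream
        queue.task_done()

async def main():
    queue: asyncio.Queue = asyncio.Queue()
    workers = [asyncio.create_task(consumer(queue, f"worker-{i}")) for i in range(3)]
    await asyncio.gather(
        producer(queue, "slack", ["msg A", "msg B"]),
        producer(queue, "rss", ["post C"]),
    )
    await queue.join()              # wait until every item is processed
    for w in workers:
        w.cancel()                  # shut the competing consumers down
    await asyncio.gather(*workers, return_exceptions=True)

asyncio.run(main())
```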
Kirk Marple:
LinkedIn
X (Twitter)
Graphlit
Graphlit Docs
Nicolay Gerold:
LinkedIn
X (Twitter)
Chapters
00:00 Building Integrations into Different Tools
00:44 The Two-Sided Funnel Model for Data Flow
04:07 Using Well-Defined Interfaces for Faster Integration
04:36 Managing Feeds and State with Actor Models
06:05 The Importance of Data Normalization
10:54 Tech Stack for Data Flow
11:52 Progression towards a Kappa Architecture
13:45 Reusability of Patterns for Faster Integration
data integration, data sources, data flow, two-sided funnel model, canonical format, stream of ingestible objects, competing consumer model, well-defined interfaces, actor model, data normalization, tech stack, Kappa architecture, reusability of patterns

Jun 19, 2024 • 37min
#013 ETL for LLMs, Integrating and Normalizing Unstructured Data
In our latest episode, we sit down with Derek Tu, Founder and CEO of Carbon, a cutting-edge ETL tool designed specifically for large language models (LLMs).

Carbon is streamlining AI development by providing a platform for integrating unstructured data from various sources, enabling businesses to build innovative AI applications more efficiently while addressing data privacy and ethical concerns.

"I think people are trying to optimize around the chunking strategy... But for me, that seems a bit maybe not focusing on the right area of optimization. These embedding models themselves have gone just like, so much more advanced over the past five to 10 years that regardless of what representation you're passing in, they do a pretty good job of being able to understand that information semantically and returning the relevant chunks." - Derek Tu on the importance of embedding models over chunking strategies

"If you are cost conscious and if you're worried about performance, I would definitely look at quantizing your embeddings. I think we've probably been able to, I don't have like the exact numbers here, but I think we might be saving at least half, right, in storage costs by quantizing everything." - Derek Tu on optimizing costs and performance with vector databases

Derek Tu:
LinkedIn
Carbon
Nicolay Gerold:
LinkedIn
X (Twitter)

Key Takeaways:
Understand your data sources: Before building your ETL pipeline, thoroughly assess the various data sources you'll be working with, such as Slack, email, Google Docs, and more. Consider the unique characteristics of each source, including data format, structure, and metadata.
Normalize and preprocess data: Develop strategies to normalize and preprocess the unstructured data from different sources. This may involve parsing, cleaning, and transforming the data into a standardized format that can be easily consumed by your AI models.
Experiment with chunking strategies: While there's no one-size-fits-all approach to chunking, it's essential to experiment with different strategies to find what works best for your specific use case. Consider factors like data format, structure, and the desired granularity of the chunks.
Leverage metadata and tagging: Metadata and tagging can play a crucial role in organizing and retrieving relevant data for your AI models. Implement mechanisms to capture and store important metadata, such as document types, topics, and timestamps, and consider using AI-powered tagging to automatically categorize your data.
Choose the right embedding model: Embedding models have advanced significantly in recent years, so focus on selecting the right model for your needs rather than over-optimizing chunking strategies. Consider factors like model performance, dimensionality, and compatibility with your data types.
Optimize vector database usage: When working with vector databases, consider techniques like quantization to reduce storage costs and improve performance. Experiment with different configurations and settings to find the optimal balance for your specific use case.

Chapters
00:00 Introduction and Optimizing Embedding Models
03:00 The Evolution of Carbon and Focus on Unstructured Data
06:19 Customer Progression and Target Group
09:43 Interesting Use Cases and Handling Different Data Representations
13:30 Chunking Strategies and Normalization
20:14 Approach to Chunking and Choosing a Vector Database
23:06 Tech Stack and Recommended Tools
28:19 Future of Carbon: Multimodal Models and Building a Platform

Carbon, LLMs, RAG, chunking, data processing, global customer base, GDPR compliance, AI founders, AI agents, enterprises
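Derek's quantization point is easy to demonstrate: storing embeddings as int8 instead of float32 cuts storage roughly 4x before any accuracy tuning (his "at least half" likely reflects a less aggressive scheme). The sketch below uses symmetric per-vector scaling, an assumption for illustration rather than Carbon's actual approach.

```python
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10_000, 768)).astype(np.float32)  # e.g. 10k docs x 768 dims

# Symmetric int8 quantization: scale each vector by its max absolute value.
scales = np.abs(embeddings).max(axis=1, keepdims=True) / 127.0
quantized = np.round(embeddings / scales).astype(np.int8)

restored = quantized.astype(np.float32) * scales                # dequantize at search time
err = np.abs(embeddings - restored).mean()

print(f"float32: {embeddings.nbytes / 1e6:.1f} MB, int8: {quantized.nbytes / 1e6:.1f} MB")
print(f"mean abs reconstruction error: {err:.4f}")
```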

Jun 14, 2024 • 28min
#012 Serverless Data Orchestration, AI in the Data Stack, AI Pipelines
Hugo Lu, Founder and CEO of Orchestra, discusses serverless data orchestration. Orchestra provides end-to-end visibility for managing data pipelines, infrastructure, and analytics. They focus on modular data pipeline components and the importance of finding the right level of abstraction. The podcast explores the evolution of architecture, unique use cases of data orchestration tools, and data orchestration for AI workloads.