If we want AI systems that actually work in production, we need better infrastructure, not just better models.
In this episode, Hugo talks with Akshay Agrawal (Marimo, ex-Google Brain, Netflix, Stanford) about why data and AI pipelines still break down at scale, and how we can fix the fundamentals: reproducibility, composability, and reliable execution.
They discuss:
- Why reactive execution matters, and how current tools fall short
- The design goals behind Marimo, a new kind of Python notebook
- The hidden costs of traditional workflows (and what breaks at scale)
- What it takes to build modular, maintainable AI apps
- Why debugging LLM systems is so hard, and what better tooling looks like
- What we can learn from decades of tools built for and by data practitioners
Toward the end of the episode, Hugo and Akshay walk through two live demos: Hugo shares how he's been using Marimo to prototype an app that extracts structured data from world leader bios, and Akshay shows how Marimo handles agentic workflows with memory and tool use, built entirely in a notebook.
This episode is about tools, but it's also about culture. If you've ever hit a wall with your current stack, or felt like your tools were working against you, this one's for you.
LINKS
Want to go deeper?
Check out Hugo's course: Building LLM Applications for Data Scientists and Software Engineers.
Learn how to design, test, and deploy production-grade LLM systems, with observability, feedback loops, and structure built in.
This isn't about vibes or fragile agents. It's about making LLMs reliable, testable, and actually useful.
Includes over $800 in compute credits and guest lectures from experts at DeepMind, Moderna, and more.
Cohort starts July 8. Use this link for a 10% discount.