
Branches, Diffs, and SQL: How Dolt Powers Agentic Workflows
Data Engineering Podcast
Summary
In this episode Tim Sehn, founder and CEO of DoltHub, talks about Dolt, the world’s first version-controlled SQL database, and why Git-style semantics belong at the heart of data systems and AI workflows. Tim explains how Dolt combines a MySQL/Postgres-compatible interface with a novel storage engine built on a Prolly Tree to enable fast, row-level branching, merging, and diffs of both schema and data. He digs into real production use cases: powering applications that expose version control to end users, reproducible ML feature stores, managing massive configuration for games, and enabling safe agentic writes via branch-based review flows. He compares Dolt’s approach to LakeFS, Neon, and PlanetScale, and explores developer workflows unlocked by decentralized clones, full audit logs, and PR-style data reviews.
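To make "row-level diffs" and "branch-based review" concrete, here is a toy in-memory sketch in Python. It is not Dolt's implementation and does not use Dolt's API or its Prolly Tree storage; the table contents, the `diff` and `merge` helpers, and the branch names are all hypothetical, chosen only to illustrate the idea of reviewing an agent's changes on a branch before merging them.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple                 # a row's non-key column values
Table = Dict[int, Row]      # primary key -> row (toy stand-in for a SQL table)


def diff(old: Table, new: Table) -> Dict[int, Tuple[Optional[Row], Optional[Row]]]:
    """Return {pk: (old_row, new_row)} for every added, removed, or changed row."""
    changes = {}
    for pk in old.keys() | new.keys():
        before, after = old.get(pk), new.get(pk)
        if before != after:
            changes[pk] = (before, after)
    return changes


def merge(base: Table, ours: Table, theirs: Table) -> Tuple[Table, List[int]]:
    """Three-way merge: apply their changes since `base` onto ours, collecting conflicts."""
    merged = dict(ours)
    conflicts = []
    ours_changes = diff(base, ours)
    for pk, (_, their_row) in diff(base, theirs).items():
        if pk in ours_changes and ours_changes[pk][1] != their_row:
            conflicts.append(pk)      # both sides changed the same row differently
        elif their_row is None:
            merged.pop(pk, None)      # row deleted on their side
        else:
            merged[pk] = their_row    # row added or updated on their side
    return merged, sorted(conflicts)


# Hypothetical game-configuration table on "main".
main = {1: ("sword", 10), 2: ("shield", 5)}

# Assumption: a branch is modeled as a plain copy; a real engine would avoid copying the table.
agent_branch = dict(main)
agent_branch[2] = ("shield", 7)       # the agent edits a row
agent_branch[3] = ("bow", 4)          # and adds a new one

main_later = dict(main)
main_later[1] = ("sword", 12)         # meanwhile, main also moves on

review = diff(main, agent_branch)     # what a human reviewer would inspect
merged, conflicts = merge(main, main_later, agent_branch)
```

In this sketch the reviewer sees exactly two changed rows, and because the agent and main touched different rows, the merge completes without conflicts. If both sides had edited the same row to different values, that row's key would be reported in `conflicts` instead of being silently overwritten.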
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- If you lead a data team, you know this pain: Every department needs dashboards, reports, custom views, and they all come to you. So you're either the bottleneck slowing everyone down, or you're spending all your time building one-off tools instead of doing actual data work. Retool gives you a way to break that cycle. Their platform lets people build custom apps on your company data while keeping it all secure. Type a prompt like 'Build me a self-service reporting tool that lets teams query customer metrics from Databricks' and they get a production-ready app with the permissions and governance built in. They can self-serve, and you get your time back. It's data democratization without the chaos. Check out Retool at dataengineeringpodcast.com/retool today and see how other data teams are scaling self-service. Because let's be honest: we all need to Retool how we handle data requests.
- Your host is Tobias Macey and today I'm interviewing Tim Sehn about Dolt, a version-controlled database engine, and its applications for agentic workflows
Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe what Dolt is and the story behind it?
- What are the key use cases that you are focused on solving by adding version control to the database layer?
- There are numerous projects related to different aspects of versioning in different data contexts (e.g. LakeFS, Datomic, etc.). What are the versioning semantics that you are focused on?
- You position Dolt as "the database for AI". How does data versioning relate to AI use cases?
- What types of AI systems are able to make best use of Dolt's versioning capabilities?
- Can you describe how Dolt and Doltgres are implemented?
- How have the design and scope of the project changed since you first started working on it?
- What are some of the architecture and integration patterns around relational databases that change when you introduce version control semantics as a core primitive?
- What are some anti-patterns that you have seen teams develop around Dolt's versioning functionality?
- What are the most interesting, innovative, or unexpected ways that you have seen Dolt used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Dolt?
- When is Dolt the wrong choice?
- What do you have planned for the future of Dolt?
Contact Info
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
Links
- Dolt
- DoltHub
- Stockmarket Data
- LakeFS
- Datomic
- Git
- MySQL
- Prolly Tree
- Neon
- Django
- Feature Store
- MCP Server
- Nessie
- Iceberg
- PlanetScale
- O(N log N) Big O Complexity
- B-Tree
- Git Merge
- Git Rebase
- AST == Abstract Syntax Tree
- Supabase
- CockroachDB
- Document Database
- MongoDB
- Gastown
- Beads


