Data Engineering Podcast

Branches, Diffs, and SQL: How Dolt Powers Agentic Workflows

Feb 1, 2026
Tim Sehn, founder and CEO of DoltHub and creator of Dolt, a version-controlled SQL database, explains why Git-style semantics belong in data systems. He covers row-level branching, merging, and diffs; production use cases such as reproducible ML feature stores and game configuration; and how branches enable safe agentic writes and PR-style data reviews.
INSIGHT

Git Semantics Built Into A SQL Database

  • Dolt combines Git-style version control with a MySQL-compatible SQL engine and a custom Prolly Tree storage layer.
  • This enables branching, diffs, merges, and commits at table and row granularity for production OLTP workloads.
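
The branch, commit, diff, and merge cycle described above can be sketched as the SQL a client might send to a Dolt server. This is a minimal illustration and was not shown in the episode: the helper name `branch_review_statements`, the branch and table names, and the commit message are hypothetical, and the procedure names (`DOLT_CHECKOUT`, `DOLT_COMMIT`, `DOLT_MERGE`) and the `DOLT_DIFF` table function should be checked against the Dolt documentation for your version. The function only builds `(sql, params)` pairs, so it runs without a database.

```python
def branch_review_statements(branch, table, message):
    """Build (sql, params) pairs for a branch -> commit -> diff -> merge cycle.

    Assumption: the pairs are executed in order on a single connection by a
    MySQL-compatible driver that uses %s placeholders.
    """
    return [
        # Create a new branch and switch the session to it.
        ("CALL DOLT_CHECKOUT('-b', %s)", (branch,)),
        # ... ordinary INSERT / UPDATE / DELETE statements would run here ...
        # Stage modified tables and commit them on the branch.
        ("CALL DOLT_COMMIT('-a', '-m', %s)", (message,)),
        # Row-level diff of one table between main and the branch (the review step).
        ("SELECT * FROM DOLT_DIFF('main', %s, %s)", (branch, table)),
        # Switch back to main and merge the reviewed branch.
        ("CALL DOLT_CHECKOUT('main')", ()),
        ("CALL DOLT_MERGE(%s)", (branch,)),
    ]


if __name__ == "__main__":
    for sql, params in branch_review_statements("agent-run-1", "game_config", "Tune drop rates"):
        print(sql, params)
```

Keeping the writes on their own branch means an agent's changes stay off `main` until the diff has been reviewed.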
ADVICE

Use Branching For App Features And ML Reproducibility

  • Back your application with Dolt to get branch, merge, and diff capabilities that you can expose to end users as version control.
  • Tag ML feature store snapshots to guarantee reproducible training and easy rollback.
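
Tagging a feature store snapshot could look roughly like the following. This is a sketch under assumptions, not something quoted from the episode: the helper `snapshot_statements`, the tag name, and the table name are hypothetical, and `DOLT_TAG`, `AS OF`, and `DOLT_RESET` should be verified against the Dolt documentation before use. As written it only builds `(sql, params)` pairs and does not connect to a server.

```python
def snapshot_statements(tag, table):
    """Build (sql, params) pairs for tagging, reading, and rolling back a snapshot.

    Assumption: `table` is a trusted identifier; it is validated here because
    table names cannot be passed as query parameters.
    """
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    return {
        # Pin the current HEAD commit under a human-readable tag.
        "tag": ("CALL DOLT_TAG(%s, 'HEAD')", (tag,)),
        # Read the table exactly as it was at the tagged commit.
        "read": (f"SELECT * FROM `{table}` AS OF %s", (tag,)),
        # Reset the current branch back to the tagged commit (destructive).
        "rollback": ("CALL DOLT_RESET('--hard', %s)", (tag,)),
    }


if __name__ == "__main__":
    for name, (sql, params) in snapshot_statements("train-2026-02-01", "features").items():
        print(name, sql, params)
```

A training job that records the tag it read from can later be rerun against identical data.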
INSIGHT

Prolly Tree Enables Fast Row-Level Diffs

  • Dolt's Prolly Tree chunks data into 4 KB content-addressed blocks and organizes them in a B-tree to enable fast diffs and shared storage.
  • That design lets you diff terabyte-scale tables by finding changed chunks instead of comparing whole files.
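
The idea of diffing by changed chunks can be illustrated with a deliberately simplified sketch. It is not Dolt's implementation: it uses a flat list of chunks rather than a tree, draws chunk boundaries from a hash of each key rather than a rolling hash over content, and ignores the 4 KB size target. All function names are hypothetical, and row values are assumed never to be `None`.

```python
import hashlib


def chunk_rows(rows, boundary_mod=4):
    """Split key-sorted rows into chunks; a chunk ends where the key's hash hits a boundary."""
    chunks, current = [], []
    for key, value in sorted(rows.items()):
        current.append((key, value))
        digest = hashlib.sha256(repr(key).encode()).digest()
        if digest[0] % boundary_mod == 0:  # content-defined boundary
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


def chunk_address(chunk):
    """Content address: identical chunks always get the same address."""
    return hashlib.sha256(repr(chunk).encode()).hexdigest()


def diff_tables(old_rows, new_rows):
    """Return {key: (before, after)}, comparing rows only inside chunks whose address changed."""
    old_chunks = {chunk_address(c): c for c in chunk_rows(old_rows)}
    new_chunks = {chunk_address(c): c for c in chunk_rows(new_rows)}
    # Chunks present on both sides are shared and skipped entirely.
    old_only = dict(kv for a, c in old_chunks.items() if a not in new_chunks for kv in c)
    new_only = dict(kv for a, c in new_chunks.items() if a not in old_chunks for kv in c)
    changes = {}
    for key in sorted(old_only.keys() | new_only.keys()):
        before, after = old_only.get(key), new_only.get(key)
        if before != after:
            changes[key] = (before, after)  # None marks a missing row
    return changes


if __name__ == "__main__":
    old = {i: f"v{i}" for i in range(100)}
    new = dict(old)
    new[42] = "changed"
    print(diff_tables(old, new))
```

Because boundaries depend only on the data, an edit disturbs the chunks around it and leaves every other chunk's address unchanged, which is what lets two versions share storage and skip unchanged regions during a diff.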