Data Engineering Podcast

Bridging the AI–Data Gap: Collect, Curate, Serve

Nov 2, 2025
Omri Lifshitz and Ido Bronstein, co-founders of Upriver, delve into the challenges of bridging the gap between AI's demand for quality data and current organizational practices. They highlight the importance of the middle layer of data curation and semantics, presenting a three-part framework: collect, curate, and serve. The duo discusses scaling from proofs of concept to production, the significance of context in AI responses, and methods for automating data documentation. They envision an AI-first future where data engineers focus on strategic roles and oversee business semantics.
AI Snips
INSIGHT

AI Exposes A Middle-Layer Bottleneck

  • AI expands both data supply and demand but exposes a bottleneck in the middle layer of curation and serving.
  • Organizations must solve semantics and serving to make AI-produced data actually usable.
INSIGHT

Errors Compound In Agentic Pipelines

  • Small AI error rates compound across pipelines and agentic interactions, turning acceptable POC accuracy into unacceptable production failure.
  • Teams must design for compounding errors when scaling AI-powered data flows.
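To make the compounding concrete, here is a minimal arithmetic sketch. It assumes every step in a pipeline succeeds independently and with the same accuracy, which is a simplification not taken from the episode; the function name and the 95% figure are illustrative only.

```python
def end_to_end_accuracy(per_step_accuracy: float, steps: int) -> float:
    """Probability that all steps succeed, assuming independent steps
    that share the same per-step accuracy (a simplifying assumption)."""
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps


if __name__ == "__main__":
    # A step that looks fine in a POC (95%) degrades quickly when chained.
    for steps in (1, 5, 10, 20):
        print(f"{steps:>2} steps: {end_to_end_accuracy(0.95, steps):.2f}")
    # Prints 0.95, 0.77, 0.60, 0.36 for 1, 5, 10, 20 steps respectively.
```

Under these assumptions, a 95%-accurate step chained ten times yields a correct end-to-end result only about 60% of the time, which is why per-step accuracy that looks acceptable in isolation can fail in production.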
ADVICE

Treat AI Like Engineering Work

  • Apply good engineering practices: build validations, evals, and human-in-the-loop checks to productionize AI workflows.
  • Treat model-led systems like software engineering projects, not one-off experiments.
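One way to picture validations combined with a human-in-the-loop check is a simple gate that routes AI-generated records either to acceptance or to a review queue. The sketch below is hypothetical and not the speakers' method: the record fields (`description`, `confidence`), the function names, and the 0.8 threshold are all assumptions made for illustration.

```python
from typing import Callable, Optional

Record = dict
# A validator returns an error message, or None when the record passes.
Validator = Callable[[Record], Optional[str]]


def non_empty_description(record: Record) -> Optional[str]:
    """Hypothetical check: an AI-written description must not be blank."""
    if not str(record.get("description", "")).strip():
        return "description is empty"
    return None


def route_records(
    records: list[Record],
    validators: list[Validator],
    min_confidence: float = 0.8,  # assumed threshold, tune per use case
) -> tuple[list[Record], list[tuple[Record, list[str]]]]:
    """Split records into auto-accepted ones and ones needing human review."""
    accepted: list[Record] = []
    review_queue: list[tuple[Record, list[str]]] = []
    for record in records:
        # Collect every validation failure for this record.
        reasons = [msg for check in validators if (msg := check(record))]
        # Low model confidence also sends the record to a human.
        if record.get("confidence", 0.0) < min_confidence:
            reasons.append("confidence below threshold")
        if reasons:
            review_queue.append((record, reasons))
        else:
            accepted.append(record)
    return accepted, review_queue
```

A caller would pass the model's output through `route_records` before publishing it, so that only records passing every check are accepted automatically and the rest carry explicit reasons for a reviewer.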