
Episode 424: Sean Knapp on Dataflow Pipeline Automation

Software Engineering Radio - the podcast for professional software developers


Is Automation a Good Idea?

Everything we write into the object store is immutable by design. We use content-addressable storage not only to point to the right block, but also to do things like deduplication. If I'm running multiple pipelines that run the same code and generate the same piece of data, you can deduplicate all of that. The general idea is, yes, anything you can parallelize, you should.
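To make the idea of content-addressable storage and deduplication concrete, here is a minimal sketch in Python. It is not the system described in the episode: the class name `ContentAddressableStore`, its methods, and the choice of SHA-256 as the content hash are all assumptions made for illustration. Each block is keyed by the hash of its bytes, so two pipelines that produce identical output end up pointing at the same stored block.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory store (hypothetical): blocks are keyed by the SHA-256 of their bytes."""

    def __init__(self):
        self._blocks = {}  # hex digest -> immutable bytes

    def put(self, data: bytes) -> str:
        """Store data and return its content address."""
        key = hashlib.sha256(data).hexdigest()
        # setdefault never overwrites, so an existing block is left untouched
        # and identical content is stored only once.
        self._blocks.setdefault(key, bytes(data))
        return key

    def get(self, key: str) -> bytes:
        """Look up a block by its content address."""
        return self._blocks[key]

    def block_count(self) -> int:
        return len(self._blocks)


store = ContentAddressableStore()
key_a = store.put(b"rows produced by pipeline A")
key_b = store.put(b"rows produced by pipeline A")  # a second pipeline, same output
key_c = store.put(b"different rows")

print(key_a == key_b)       # identical content maps to the same address
print(store.block_count())  # number of distinct blocks actually stored
```

Because the key is derived from the content, a block can never change under its key, which is one way to get the immutability the speaker mentions. A real object store would persist blocks and handle concurrency, which this sketch does not attempt.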

