
Pachyderm with Daniel Whitenack - Episode 1

Data Engineering Podcast


Pachyderm - What's the Trade-Off of Data Versioning?

The versioning capability of the data is handled at least partially by tracking the diffs as you apply different manipulations to the data itself. We store maybe about 64 bytes of metadata per 8-megabyte block of actual data that you're pushing into versioning. So it's pretty space efficient in that way. For each analysis, you're basically analyzing data at a certain state. You're not having to scrub through a history of data commits in order to figure out how to process your data.
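To put the quoted figure in perspective, here is a rough back-of-the-envelope sketch of the metadata overhead. It assumes exactly 64 bytes of metadata per block and a binary 8 MiB block size; the speaker only says "maybe about", so both constants are approximations, and the function names are hypothetical rather than part of any Pachyderm API.

```python
# Rough overhead estimate based on the figures quoted in the episode.
# Assumptions (not confirmed by the source): exactly 64 bytes of metadata
# per block, and "8 megabytes" meaning 8 MiB (8 * 1024 * 1024 bytes).
BLOCK_SIZE_BYTES = 8 * 1024 * 1024
METADATA_PER_BLOCK_BYTES = 64


def metadata_overhead_bytes(data_bytes: int) -> int:
    """Estimate the metadata stored for `data_bytes` of versioned data."""
    # Ceiling division: a partially filled block still counts as one block.
    blocks = -(-data_bytes // BLOCK_SIZE_BYTES)
    return blocks * METADATA_PER_BLOCK_BYTES


def overhead_ratio() -> float:
    """Metadata bytes per byte of data for a completely full block."""
    return METADATA_PER_BLOCK_BYTES / BLOCK_SIZE_BYTES


if __name__ == "__main__":
    one_tib = 1024 ** 4
    print(f"Metadata for 1 TiB of data: {metadata_overhead_bytes(one_tib)} bytes")
    print(f"Overhead ratio: {overhead_ratio():.8%}")
```

Under these assumptions, 1 TiB of data carries about 8 MiB of metadata, which is well under a thousandth of a percent of the data size.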

