The Analytics Engineering Podcast

Making Sense of the Last 2 Years in Data

Jun 17, 2022
INSIGHT

Separation Enables Specialized Data Systems

  • Different use cases call for different data systems; separating storage from compute provides the flexibility to run each workload on a suitable engine.
  • Table formats (Delta, Iceberg, Hudi) add transactional guarantees to object storage so diverse engines can share data reliably.
INSIGHT

Table Formats Bring Warehouse Guarantees To Files

  • Table formats (Delta, Iceberg, Hudi) add a metadata layer over object stores so that collections of files behave like transactional tables.
  • With that layer in place, engines like Spark, Presto, ClickHouse, and Druid behave more like SQL systems when working on the same data.
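
To make the idea of a transactional layer over plain files concrete, here is a minimal sketch of an ordered commit log with optimistic concurrency. It is a toy under stated assumptions, not how Delta, Iceberg, or Hudi are actually implemented: the `ToyTable` class, the `_log` directory, and the `part-*.parquet` names are all hypothetical, and a local directory stands in for an object store.

```python
import json
import os
import tempfile


class ToyTable:
    """Toy table: a directory with an ordered commit log (hypothetical layout)."""

    def __init__(self, root):
        self.log_dir = os.path.join(root, "_log")
        os.makedirs(self.log_dir, exist_ok=True)

    def _versions(self):
        # Each commit is one file named by its zero-padded version number.
        return sorted(int(name.split(".")[0]) for name in os.listdir(self.log_dir))

    def latest_version(self):
        versions = self._versions()
        return versions[-1] if versions else -1

    def commit(self, read_version, add=(), remove=()):
        """Try to publish version read_version + 1.

        Returns False if another writer already claimed that version.
        """
        path = os.path.join(self.log_dir, f"{read_version + 1:08d}.json")
        try:
            # O_EXCL makes creation fail if the file exists, so only one
            # writer can win a given version number.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        # Simplification: a real system would also need readers never to
        # observe a half-written commit file.
        with os.fdopen(fd, "w") as f:
            json.dump({"add": list(add), "remove": list(remove)}, f)
        return True

    def snapshot(self, version=None):
        """Replay the log up to `version` and return the live data files."""
        live = set()
        for v in self._versions():
            if version is not None and v > version:
                break
            with open(os.path.join(self.log_dir, f"{v:08d}.json")) as f:
                entry = json.load(f)
            live |= set(entry["add"])
            live -= set(entry["remove"])
        return sorted(live)


def demo():
    with tempfile.TemporaryDirectory() as root:
        table = ToyTable(root)
        table.commit(table.latest_version(), add=["part-0.parquet"])

        # Two writers both start from the same version; only one can commit.
        base = table.latest_version()
        first = table.commit(base, add=["part-1.parquet"])
        second = table.commit(base, add=["part-2.parquet"])

        # The losing writer re-reads the latest version and retries.
        retry = table.commit(
            table.latest_version(),
            add=["part-2.parquet"],
            remove=["part-0.parquet"],
        )
        return first, second, retry, table.snapshot(), table.snapshot(version=1)


if __name__ == "__main__":
    print(demo())
```

Any engine that can list the log and read the data files sees the same consistent snapshot, which is the property that lets several engines share one copy of the data. Passing an older `version` to `snapshot` shows how the same log can also serve reads of earlier table states.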
ADVICE

Match Tools To Use Cases

  • Choose tooling based on your use case rather than trying to force one stack to do everything.
  • Use specialized engines (ClickHouse, Druid, Spark) for workloads that demand their performance/latency profiles.