127: The Anatomy of a Data Lakehouse with Alex Merced of Dremio

The Data Stack Show

CHAPTER

Querying the Data at Scale

The first step is basically to land your data in a format like Parquet that's really built for analytics. Parquet offers you lots of benefits, such as organizing the data into row groups and carrying metadata. From there, you move on to being able to query the data, and query it at scale. And when I say at scale, I don't mean petabytes, but at the scale of the organization, making it available all over the world. So we kind of go back to that foundational level and build up that data lakehouse.
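
To make the point about row groups and metadata concrete, here is a minimal sketch in plain Python. It is a toy model, not the Parquet format or any library's API: the `RowGroup` class and the `write_row_groups` and `scan_greater_than` functions are hypothetical names invented for illustration. The idea it shows is that if each group of rows carries its own min/max summary, a query can skip whole groups without reading their rows.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RowGroup:
    """Toy stand-in for a row group: a chunk of rows plus summary metadata."""
    rows: List[int]
    min_value: int
    max_value: int


def write_row_groups(values: List[int], rows_per_group: int) -> List[RowGroup]:
    """Split values into fixed-size groups and record min/max for each group."""
    groups = []
    for start in range(0, len(values), rows_per_group):
        chunk = values[start:start + rows_per_group]
        groups.append(RowGroup(chunk, min(chunk), max(chunk)))
    return groups


def scan_greater_than(groups: List[RowGroup], threshold: int) -> Tuple[List[int], int]:
    """Return the matching values and the number of groups actually read."""
    matches: List[int] = []
    groups_read = 0
    for group in groups:
        if group.max_value <= threshold:
            continue  # the metadata alone proves no row in this group can match
        groups_read += 1
        matches.extend(v for v in group.rows if v > threshold)
    return matches, groups_read


# Twelve values land in three groups of four rows each.
groups = write_row_groups(list(range(1, 13)), rows_per_group=4)
matches, groups_read = scan_greater_than(groups, threshold=9)
print(matches, groups_read)
```

In this sketch only the last of the three groups is read for the `> 9` filter. Real Parquet files store comparable per-column statistics for each row group, which query engines can use in a similar way to avoid scanning data that cannot match.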
