Supporting And Expanding The Arrow Ecosystem For Fast And Efficient Data Processing At Voltron Data

Data Engineering Podcast

The Biggest Gap in Data Management

I think we're still in a state of learning and change when it comes to how best to build and manage very large data lakes. I'm really hopeful about Apache Iceberg, which came out of Netflix and is one of the next-generation approaches for large-scale data set management. The sooner that we can settle on standards for scalable open data warehousing, so to speak, I think that makes things less of a moving target. And so if the world becomes increasingly standardized on Iceberg, for example, and on file formats like Parquet, then that simplifies the problem for the engine and user interface developers who want to make an end-to-end stack.
