Dive into the world of data lakehouses and Apache Iceberg! Discover how these technologies streamline data management by reducing duplication and improving accessibility. Learn about the evolving landscape of hybrid data strategies and the critical role of data governance in optimizing large language models. Explore unique features of data lakehouse platforms that enhance team collaboration and performance. Plus, gain hands-on insights into leveraging Apache Iceberg for impactful analytics!
INSIGHT
Data Lakehouse Unifies Tools And Cuts Cost
Data lakehouses add table and catalog layers on top of raw data lakes to restore database-like guarantees.
This lets many tools use a single consistent copy of data, reducing movement and cost.
INSIGHT
Lakehouses Shift Analytics Away From Warehouses
Lakehouses don't replace application databases but are displacing analytic data warehouses.
Major warehouse providers are adding Iceberg support to stay relevant with single-copy data demands.
ADVICE
Pick A Catalog That Implements The REST Spec
Choose an Iceberg-compatible catalog (Nessie, Polaris, or others) to enable broad tool compatibility.
Prefer catalogs implementing the Iceberg REST spec so tools can talk uniformly to your catalog.
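As a rough illustration of why a shared REST spec helps, the sketch below builds the URLs a client would request from any catalog that implements the Iceberg REST spec. The helper name `rest_catalog_urls`, the host `catalog.example.com`, and the namespace and table names are hypothetical. The route shapes (`/v1/config`, `/v1/{prefix}/namespaces/...`) and the `%1F` separator for multi-level namespaces follow the published REST catalog spec as best understood here, so check them against the version your catalog implements.

```python
from urllib.parse import quote


def rest_catalog_urls(base_uri, namespace, table, prefix=""):
    """Build example Iceberg REST catalog URLs (hypothetical helper).

    base_uri:  root URI of the catalog service (assumed value).
    namespace: list of namespace levels, e.g. ["analytics", "sales"].
    table:     table name inside that namespace.
    prefix:    optional path prefix a catalog may hand back from /v1/config.
    """
    root = base_uri.rstrip("/") + "/v1"
    # Assumption: the prefix is inserted as a plain path segment.
    scoped = root + ("/" + prefix.strip("/") if prefix else "")
    # Multi-level namespaces are joined with the unit separator (0x1F),
    # which percent-encodes to %1F.
    ns = quote("\x1f".join(namespace), safe="")
    return {
        "config": root + "/config",
        "namespaces": scoped + "/namespaces",
        "tables": scoped + "/namespaces/" + ns + "/tables",
        "table": scoped + "/namespaces/" + ns + "/tables/" + quote(table, safe=""),
    }


if __name__ == "__main__":
    urls = rest_catalog_urls(
        "https://catalog.example.com/api", ["analytics", "sales"], "orders"
    )
    for name, url in urls.items():
        print(name, url)
```

Because every compliant catalog exposes the same routes, switching catalogs should mostly mean changing `base_uri` and credentials rather than rewriting client code.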
Topic 1 - Welcome to the show. Tell us a little bit about your background.
Topic 2 - It’s been a little while since we talked about Data Lakehouses. Can you give us a little bit of background on this space, and what the most recent dynamics are around these technologies?
Topic 3 - What are the typical integrations with a Data Lakehouse? How are users/developers typically interacting with Data Lakehouse technologies? [The marketplace for Iceberg catalogs like Nessie and Polaris]
Topic 4 - How does an open data format like Apache Iceberg fit into the bigger picture of data lakehouses, or large-scale stores of data?
Topic 5 - How does Dremio enable Iceberg? How does Dremio sit at the intersection of the Data Lakehouse, Data Mesh, and Data Virtualization trends, all of which come from the same fundamental problem: the growing scale of data use cases?
Topic 6 - We’ve seen companies start to rethink their data-in-the-cloud strategies. Are you seeing on-premises making a comeback for large data applications?