

Gnarly Data Waves by Dremio
Dremio (The Open Data Lakehouse Platform)
Gnarly Data Waves is a weekly show about the world of Data Analytics and Data Architecture. Learn about the technologies giving companies access to cutting-edge insights. If you work with datasets, data warehouses, data lakes, or data lakehouses, this show is for you!
Join us for our live recordings to participate in the Q&A:
dremio.com/events
Subscribe to the Dremio YouTube channel on:
youtube.com/dremio
Take the Dremio Platform for a free test-drive:
https://www.dremio.com/test-drive/
Episodes
Mentioned books

Oct 29, 2024 • 39min
EP59: Charting the Course: The Evolution and Future of Apache Iceberg and Polaris
Jean-Baptiste Onofré, a principal software engineer at Dremio and board member of the Apache Software Foundation, discusses the evolution of Apache Iceberg and its future with Polaris. He delves into Iceberg's role in data management, emphasizing its features like hidden partitioning and enhanced query efficiency. He also explores Polaris's architecture, focusing on its access control capabilities and multi-table transactions. The conversation highlights upcoming features and community collaboration, showcasing the innovative shift in the data ecosystem.
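The hidden partitioning mentioned above refers to Iceberg deriving partition values from column data via declared transforms, so neither writers nor queries handle partition columns directly. A minimal, purely illustrative Python sketch of the idea (not Iceberg's actual implementation; all names are hypothetical):

```python
from datetime import datetime

# Sketch of Iceberg-style hidden partitioning: the table format applies a
# declared transform (here, a day() transform) to a source column, so
# partition values are derived rather than user-maintained.
def day_transform(ts: datetime) -> str:
    """Derive a partition value from an event timestamp."""
    return ts.strftime("%Y-%m-%d")

rows = [
    {"event_ts": datetime(2024, 10, 29, 9, 15), "user": "a"},
    {"event_ts": datetime(2024, 10, 29, 18, 40), "user": "b"},
    {"event_ts": datetime(2024, 10, 30, 7, 5), "user": "c"},
]

# Rows are grouped into partitions by the derived value; a filter on
# event_ts can then prune whole partitions without the query ever
# naming a partition column.
partitions: dict[str, list[dict]] = {}
for row in rows:
    partitions.setdefault(day_transform(row["event_ts"]), []).append(row)

print(sorted(partitions))             # ['2024-10-29', '2024-10-30']
print(len(partitions["2024-10-29"]))  # 2
```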

Oct 29, 2024 • 15min
EP58 - Transforming the Landscape of Real-Time Data Analytics
Dive into the exciting world of real-time data analytics as Dremio showcases its powerful capabilities within a data lakehouse framework. Watch a live demo where data sourcing and governance come to life! Learn how dynamic reporting is enhanced with a customer dataset, highlighting gender categorization and customer ID counts. Discover the importance of data lineage and governance, and see how features like Apache Iceberg and the virtual semantic layer facilitate seamless updates across datasets.

Oct 8, 2024 • 1h 4min
EP57 - From Hadoop & Hive to Minio & Dremio: Moving Towards a Next Gen Data Architecture
Legacy data platforms often fall short of the performance, processing and scaling requirements for robust AI/ML initiatives. This is especially true in complex multi-cloud (public, private, edge, airgapped) environments.
The combined power of MinIO and Dremio creates a data lakehouse platform that overcomes these challenges, delivering scalability, performance and efficiency to ensure successful AI initiatives.
Watch Brenna Buuck, Sr. Technical Evangelist at MinIO, and Alex Merced, Sr. Technical Evangelist at Dremio, as they provide insights on:
- AI Workflows: How a data lakehouse simplifies critical AI tasks like model training, refinement, feature selection and real-time inference for faster decisions
- Scalability and Performance: How a data lakehouse architecture scales seamlessly to meet the fast-growing demands of AI applications
- Data Management Efficiency: How a data lakehouse streamlines data management for IT teams, allowing them to focus on innovation

Sep 10, 2024 • 32min
EP56 - What’s New in Dremio: Improved Automation, Performance + Catalog for Iceberg Lakehouses
Dremio unveiled new features in our latest release that enhance the creation, performance, and management of Apache Iceberg data lakehouses.
You will learn how Dremio delivers market-leading SQL query and write performance, improved federated query security and management, and streamlined data ingestion through:
- Live Reflections on Iceberg tables that accelerate performance, ensure up-to-date data, and reduce management overhead
- Result Set Caching that can accelerate query performance up to 28X
- Merge-on-Read that can enhance write and ingestion speed
- Auto Ingest Pipes that eliminate complex pipeline setup and maintenance
- User Impersonation for federated queries that allows for granular permissions, better access control, and user workload tracking
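The merge-on-read feature above speeds up writes by recording deletes as lightweight markers instead of rewriting data files, leaving readers to merge them at scan time. A simplified Python sketch of that idea (not Dremio's or Iceberg's actual implementation):

```python
# Illustrative sketch of merge-on-read: an immutable base data file plus a
# set of delete markers. Writes stay cheap because deleting a row only
# appends a marker; the merge cost is paid at read time.
base_file = [
    {"id": 1, "status": "active"},
    {"id": 2, "status": "active"},
    {"id": 3, "status": "active"},
]

# A delete appends a marker -- the base file is never rewritten.
delete_markers = {2}

def scan(rows, deletes):
    """Merge base rows with delete markers at read time."""
    return [r for r in rows if r["id"] not in deletes]

print([r["id"] for r in scan(base_file, delete_markers)])  # [1, 3]
```

Copy-on-write makes the opposite trade: it rewrites the affected data files on every delete, so reads stay cheap while writes get more expensive.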

Aug 21, 2024 • 14min
EP55 - Unite Data Across Dremio, Snowflake, Iceberg, and Beyond
Organizations want to empower teams with data at their fingertips and in every part of their business. They want their teams to move quickly with data never as a bottleneck, but an accelerant to decision making —all without the curiosity tax too common in consumption-based cloud platforms. Dremio enables data teams to unify all of their disparate data, from Snowflake to Iceberg and other sources, by combining an intelligent semantic layer with a powerful SQL platform that eliminates silos, optimizes costs through intelligent query acceleration, and enables self-service analytics.
You will learn how Dremio enables Snowflake users to:
- Unify all of your data from Snowflake and all sources
- Optimize analytics costs and performance
- Use easy self-service analytics for faster time-to-insight
- Ensure Apache Iceberg native compatibility for future-proof data access

Aug 20, 2024 • 51min
EP54 - Mastering Semantic Layers: The Key to Data-Driven Innovation
Learn how to master semantic layers with Dremio. We will provide a high-level overview of their purpose in modern analytics, showing how they act as a bridge between complex data sources and business users.
You’ll learn how semantic layers simplify data access, ensure consistency, and empower users to derive meaningful insights from data, regardless of their technical expertise.
- The definition and core purpose of a semantic layer in data analytics: How it acts as a bridge between complex data and business users, simplifying data access and interpretation.
- Key benefits and use cases of semantic layers: How they enable self-service analytics, ensure data consistency, and accelerate time-to-insight.
- How Dremio's semantic layer technology can transform your data strategy: Dremio makes it easier to manage and leverage your data for faster, data-driven decision-making.
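The bridging role described above can be boiled down to a lookup: business-friendly names and shared metric definitions map onto governed physical expressions, so every consumer gets the same logic. A minimal Python sketch of the concept (all table and column names are hypothetical, and this is not Dremio's implementation):

```python
# Sketch of a semantic layer: one shared mapping from business terms to
# governed physical expressions. Consumers ask for "Order Revenue" and
# never need to know (or disagree about) the underlying columns.
semantic_model = {
    "Customer Name": "crm.cust_tbl.cst_nm",
    "Order Revenue": "SUM(sales.ord_fact.amt_usd)",
}

def resolve(business_term: str) -> str:
    """Translate a business term into its single, governed definition."""
    return semantic_model[business_term]

print(resolve("Order Revenue"))  # SUM(sales.ord_fact.amt_usd)
```

Because every tool resolves metrics through the same model, a change to a definition propagates everywhere at once, which is what gives a semantic layer its consistency guarantee.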

Aug 19, 2024 • 48min
EP53 - Build the next-generation Iceberg lakehouse with Dremio and NetApp
Watch Vishnu Vardhan, Director of Product Management, StorageGRID at NetApp, and Alex Merced, Senior Technical Evangelist at Dremio, as they explore the future of data lakes and discover how NetApp and Dremio can revolutionize your analytics by delivering the next generation of lakehouse with Apache Iceberg.
Transitioning to a modern data lakehouse environment allows organizations to increase business insight, reduce management complexity, and lower overall TCO of their analytics environments. The growing adoption of Apache Iceberg is a key enabler for building the next generation lakehouse. Its robust feature set, coupled with an open ecosystem for analytics use cases, including ACID transactions, time travel, and schema evolution, continues to drive rapid adoption.
Vishnu and Alex will delve into market trends surrounding Iceberg, as well as key drivers for lakehouse adoption and modernization.
You will learn about:
- Iceberg adoption trends
- NetApp StorageGRID and its benefits
- The Dremio and NetApp data lakehouse solution
- Key Iceberg data lakehouse modernization use cases
- Customer examples

Jul 15, 2024 • 4min
Apache Iceberg Lakehouse crash course
Watch and learn about Apache Iceberg in a 10-part web series designed to help you master it.
https://hello.dremio.com/webcast-an-apache-iceberg-lakehouse-crash-course-reg.html?utm_medium=social-free&utm_source=youtube&utm_content=webcast-gdw-se-the-architecture-of-apache-iceberg-apache-hudi-and-delta-lake-intro&utm_campaign=webcast-gdw-se-the-architecture-of-apache-iceberg-apache-hudi-and-delta-lake-intro
"An Apache Iceberg Lakehouse Crash Course" is an in-depth webinar series designed to provide a comprehensive understanding of Apache Iceberg and its pivotal role in modern data lakehouse architectures.
Over the course of ten sessions, you'll explore a wide range of topics:
- foundational concepts like data lakehouses and table formats
- advanced features such as partitioning, optimization, and streaming with Apache Iceberg
Each session will offer detailed insights into the architecture and capabilities of Apache Iceberg, alongside practical demonstrations of data ingestion using tools like Apache Spark and Dremio.

Jul 3, 2024 • 58min
EP51 - Scania’s Journey in Navigating and Implementing Data Mesh
As the demand for data analytics grows, and with a decentralized approach at its core, major Swedish manufacturer Scania needed to balance domain autonomy and alignment while implementing a self-serve data and governance platform, coupled with a unified way of accessing data.
Discover how Scania addressed these challenges by adopting a data mesh strategy, and how using Dremio and Witboost has facilitated their journey. Learn about the cultural shifts, changes, and partnerships that are driving tangible business impacts. Additionally, gain insights and trends from Dremio's Field CDO and the co-founder and CTO of Witboost.

Jun 25, 2024 • 42min
EP52 - The Best of the Subsurface Data Lakehouse Conference 2024
Join industry leaders from Nomura, NetApp, and Blue Cross in a captivating recap of Subsurface 2024, discussing real-world data lakehouse implementations. Explore the transformative potential of open-source projects like Apache Iceberg, Apache XTable, and Ibis, and gain valuable insights from experts shaping the future of data.


