

Gnarly Data Waves by Dremio
Dremio (The Open Data Lakehouse Platform)
Gnarly Data Waves is a weekly show about the world of Data Analytics and Data Architecture. Learn about the technologies giving companies access to cutting-edge insights. If you work with datasets, data warehouses, data lakes or data lakehouses, this show is for you!
Join us for our live recordings to participate in the Q&A:
dremio.com/events
Subscribe to the Dremio youtube channel on:
youtube.com/dremio
Take the Dremio Platform for a free test-drive:
https://www.dremio.com/test-drive/
Episodes

Dec 18, 2023 • 42min
EP41 - ZeroETL & Virtual Data Marts: The Cutting Edge of Lakehouse Architecture
Embark on a transformative journey with our insightful presentation, "ZeroETL & Virtual Data Marts: The Cutting Edge of Lakehouse Architecture." In this engaging video, we'll delve into the intricacies of modern data engineering and how it has evolved to address key pain points in the realm of data processing.
Alex will illuminate the challenges data engineers face, from the complexities of backfilling and brittle pipelines to the frustration of sluggish data delivery. We'll introduce you to the high-impact concepts of ZeroETL and Virtual Data Marts, demonstrating how these innovative patterns can dramatically alleviate these common pains. By reducing the need for manual data movement and preparation pipelines, you'll discover a more efficient, agile, and responsive data ecosystem.
Watch this video as Alex Merced, Developer Advocate at Dremio, provides a practical guide to implementing these transformative patterns. He'll walk you through the steps to bring the power of ZeroETL and Virtual Data Marts into your own data landscape. Leveraging cutting-edge tools like Dremio, dbt, and more, you'll gain hands-on experience in designing and deploying these patterns to streamline your data workflows and supercharge your analytics capabilities.
Don't miss this opportunity to stay at the forefront of data architecture, enabling your organization to harness data's full potential while reducing complexity and overhead. Join us for an exploration of the future of data engineering – a future where ZeroETL and Virtual Data Marts pave the way for data agility, speed, and innovation.
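As a rough illustration of the Virtual Data Mart idea described above, the sketch below defines a "mart" as a view over source data rather than as a copied table. It uses Python's built-in sqlite3 module purely as a stand-in; it is not Dremio's API, and the table name, view name, and helper function are hypothetical.

```python
import sqlite3

# Hypothetical source table standing in for data that lives in a lake or database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "east", 10.0), (2, "west", 25.0), (3, "east", 5.0)],
)

# The "virtual data mart": a view, so no rows are copied and no pipeline runs.
conn.execute(
    "CREATE VIEW sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"
)


def mart_totals(connection):
    """Read the view; the aggregation is computed from the source at query time."""
    rows = connection.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    ).fetchall()
    return dict(rows)


before = mart_totals(conn)

# New source data arrives; the view picks it up with no backfill step.
conn.execute("INSERT INTO orders VALUES (4, 'west', 15.0)")
after = mart_totals(conn)
```

Because the mart is only a query definition, the second read reflects the new row without any data movement. A real lakehouse engine would add acceleration and governance on top of this basic pattern.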
Ready to Get-Started: https://www.dremio.com/get-started/?u...
See all upcoming episodes and past episodes: https://www.dremio.com/gnarly-data-wa...
Connect with us!
Community Forum: https://bit.ly/2ELXT0W
Github: https://bit.ly/3go4dcM
Blog: https://bit.ly/2DgyR9B
Questions?: https://bit.ly/30oi8tX
Website: https://bit.ly/2XmtEnN
Resource: https://www.dremio.com/resources/?utm...
Events: https://www.dremio.com/events/?utm_me...
#datalakehouse #analytics #datawarehouse #datalake #dataengineers #dataarchitects #governance #dremiocloud #opendatalakehouse #apacheiceberg #selfservice #enterprisedata #multitables #tableformat #ETL #BI #genai #datapipelines #datamovement #automation #security #nodatacopies #dataanalytics #reflection #dataaccess #storagecost #lowercompute #maintenancecost #networkfees #licensingfees #federatesources #dashboards #c3 #apachearrow #queryoptimizer #rawreflections #aggregatereflections #lakehousearchitecture #zeroetl #virtualdatamarts #datasource

Dec 15, 2023 • 1h 6min
MEETUP: ZeroETL & Virtual Data Marts - Orlando Data Professionals Meetup
Get hands on with Dremio on Your Laptop:
https://www.dremio.com/blog/intro-to-...
The challenges of building and maintaining data pipelines have become all too familiar. This meetup, titled "ZeroETL & Virtual Data Marts: A Discussion in Painless Data Engineering," aims to shed light on these common pains and explore innovative solutions that can revolutionize the field.
The event will kick off with an engaging presentation that delves into the typical pain points experienced by data engineers. These challenges include dealing with brittle pipelines that often necessitate endless backfilling and contending with the delays resulting from layers of pipelines, leading to stale and inaccurate data reaching data consumers.
However, the meetup doesn't stop at identifying problems. Our discussion will introduce you to potential solutions that harness the power of Dremio, enabling the adoption of "Painless" Patterns such as "ZeroETL" and "Virtual Data Marts." These patterns are designed to reduce the manual effort involved in data movement and the creation of data movement pipelines. Attendees will gain insights into how these approaches can streamline data engineering workflows, enhance data quality, and improve data accessibility for stakeholders.

Dec 14, 2023 • 59min
Workshop: Build an Iceberg Lakehouse in 60 minutes
Led by Mark Hoerth, Escalations Engineer, this workshop will guide you through the process of creating tables in your Iceberg catalog, ingesting Iceberg Tables into Amazon S3, creating a clean data product, enabling governed self-service for your organization, and ultimately querying the data through our SQL Runner and a BI Tool.
Key Learning Points:
- Create tables in your Iceberg catalog
- Manage data as code to create a production data product effortlessly
- Implement data controls to enable governed self-service for your business
- Create a Reflection for sub-second BI performance
- Query the data product effectively
Before the workshop, be sure to configure your Dremio Cloud Account - https://www.dremio.com/sign-up/ and create your first Sonar Project - https://docs.dremio.com/cloud/tutoria.... That way you will be ready to follow along and get the most out of the session.
Workshop Source Code - https://gist.github.com/isha-dremio/7...
#datalakehouse #analytics #datawarehouse #datalake #dataengineers #dataarchitects #infrastructure #dremiocloud #dremiotestdrive #openlakehouse #opendatalakehouse #gnarlydatawaves #apacheiceberg #datasharing #ETL #selfservice #dataascode #branches #tableformat #dremiosonar #enterprisedata #reflections #workshop #iceberglakehouse #buildin60 #MarkHoerth

Dec 13, 2023 • 29min
EP40 - How Dremio provides you fast and easy data access while saving you money
In this video with Alex Merced, Developer Advocate, we'll explore how Dremio revolutionizes data access, delivering speed, simplicity, and substantial cost savings. Discover the power of Dremio as we dive deep into:
- Data Access at Lightning Speed: Learn how Dremio accelerates data access, making insights available in real-time.
- Simplicity in Data Preparation: Streamline your data pipeline with Dremio's intuitive interface for data transformation.
- Cost Efficiency: Uncover how Dremio’s optimizations save you money while improving performance.
- Use Cases: Explore real-world success stories and applications of Dremio's data access solutions.
- Future-Proofing Your Data Infrastructure: Understand how Dremio ensures scalability and adaptability.
Watch this video to uncover the secrets of fast, easy data access without breaking the bank!
#datalakehouse #analytics #datawarehouse #datalake #dataengineers #dataarchitects #governance #dremiocloud #dremiotestdrive #opendatalakehouse #apacheiceberg #selfservice #enterprisedata #multi-tables #dataanalytis #tableformat #cloud #ETL #BI #genai #llm #datapipelines #datamovement #automation #security #nodatacopies #dataanalytics #reflection #dataaccess #otpbank #ncr #henkel #storagecost #lowercompute #maintenancecost #networkfees #licensingfees #federatesources #dashboards #c3 #apachearrow #queryoptimizer #rawreflections #aggregatereflections

Dec 12, 2023 • 43min
EP39 - How To Build an Iceberg Data Lakehouse with Fivetran and Dremio
Organizations are struggling with the proliferation of tools in their data infrastructure, and the exponential growth of ETL pipelines is slowing data engineers' ability to deliver value to the business. They want to spend more time making impactful decisions and working on high-value projects. Fivetran significantly reduces the time spent building ETL pipelines with its no-code approach. Dremio is the easy and open data lakehouse, providing self-service analytics with data warehouse functionality and data lake flexibility across all your data. Together, Dremio and Fivetran help organizations go to market faster.
In this video, you will learn:
- What the Iceberg table format is and why it matters in data lakehouses
- How to load source files into Iceberg tables using Fivetran
- How to create a unified access layer for your data with Dremio Cloud
#datalakehouse #analytics #datawarehouse #datalake #dataengineers #dataarchitects #governance #dremiocloud #dremiotestdrive #opendatalakehouse #apacheiceberg #selfservice #enterprisedata #multi-tables #analytics #dataanalytis #tableformat #cloud #cli #api #ETL #BI #fivetran #hadoop #AI #ML #genai #llm #datapipelines #datamovement #automation #scale #saas #cdc #governance #security #nodatacopies #dataanalytics #pii #reflection #timetravel #parquet #workday #oracle #postgres #aws #s3

Dec 11, 2023 • 32min
EP38 - Building a Data Science Platform on Apache Iceberg and Nessie
In this insightful discussion, Jacopo Tagliabue, founder of Bauplan Labs and former AI/MLOps lead at Coveo, delves into building a modern data science platform. He explains why open-source technologies like Apache Iceberg and Project Nessie are essential for developing efficient pipelines. Jacopo highlights the importance of human-readable code, minimizing infrastructure complexity, and facilitating fast feedback loops. He also discusses how Nessie enables multi-table versioning and reproducibility, revolutionizing data management in machine learning.

Oct 19, 2023 • 40min
EP37 - How NetApp is Redefining the Customer Experience with Product Analytics
Product analytics offers a transformative opportunity for companies to elevate the customer experience and offer a way to proactively understand customer behavior. This personalized understanding allows companies to tailor their product offerings, provide targeted recommendations, and streamline customer journeys, resulting in a more engaging, satisfying, and loyalty-inducing customer experience. Effective product analytics is a comprehensive strategy to proactively manage support and promote customer success.
NetApp, a leading global company specializing in hybrid cloud data services, helps enterprises build a simple and secure way to drive innovation wherever their data and applications live. The customer experience is a core driver within NetApp’s portfolio of solution offerings.
Watch Aaron Sims, Technical Director at NetApp, as he shares his experience building out a unified access layer for product analytics with Dremio.
In this video, you will learn:
- NetApp’s journey to unified analytics with Dremio’s phased approach for Hadoop modernization
- How a unified access layer makes data easier to discover and explore for your end users without data duplication
- Ways to maximize your existing infrastructure investments for improved ROI and lower TCO with Dremio.
#datalakehouse #analytics #datawarehouse #datalake #dataengineers #dataarchitects #governance #dremiocloud #getstarted #opendatalakehouse #apacheiceberg #selfservice #table #enterprisedata #analytics #dataanalytis #governance #security #cii #netapp #productanalytics #activeiq #producttelemetry #optimization #bigdata #ai #hadoop #spark #storagegrid #pipelines #semanticlayer #lakehousequeryengine #dataingestion #etl #query #disasterrecovery #tco

Oct 13, 2023 • 31min
EP36 - Simplify Lakehouse Operations with Zero-Copy Clones and Multi-Table Transactions
Organizations that want to leverage their data lake for insights often struggle to deliver a consistent, accurate, high-quality view of their data to all of their data consumers. That challenge is often exacerbated by the need to make changes to data that impact multiple tables.
In this video, we’ll share how data teams can use Dremio Arctic, a data lakehouse management service, to simplify data management and operations. Using Git for Data capabilities like branching, tagging, and commits, we’ll show how Dremio Arctic makes it easier than ever to:
- Create zero-copy clones of your data lake so data consumers can work on production-quality data without impacting other users.
- Quickly make updates to all of your tables and merge those changes atomically, so every user has access to an accurate and consistent view of the data lake.
- Reduce the costs and complexities associated with data lakehouse management.
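To make the branching ideas in the list above concrete, here is a minimal conceptual sketch in Python. It is a toy model only: the ToyCatalog class, its methods, and the file names are hypothetical and are not the Dremio Arctic or Project Nessie API. It assumes a branch is just a name pointing at an immutable snapshot of table-to-files mappings.

```python
from types import MappingProxyType


class ToyCatalog:
    """Hypothetical in-memory catalog: branches are names pointing at immutable snapshots."""

    def __init__(self, tables):
        # A snapshot maps table name -> tuple of data file names.
        self.branches = {"main": MappingProxyType(dict(tables))}

    def create_branch(self, name, source="main"):
        # Zero-copy: the new branch shares the same snapshot object; no data is duplicated.
        self.branches[name] = self.branches[source]

    def write(self, branch, table, files):
        # A write builds a new snapshot that is visible on this branch only.
        snapshot = dict(self.branches[branch])
        snapshot[table] = tuple(files)
        self.branches[branch] = MappingProxyType(snapshot)

    def merge(self, source, target="main"):
        # Atomic: one pointer swap publishes every table change together.
        # Simplification: assumes the target has not changed since the branch was created.
        self.branches[target] = self.branches[source]


catalog = ToyCatalog({"orders": ("o1.parquet",), "customers": ("c1.parquet",)})
catalog.create_branch("etl")
shared = catalog.branches["etl"] is catalog.branches["main"]

# Change two tables on the branch; readers of main see nothing yet.
catalog.write("etl", "orders", ["o1.parquet", "o2.parquet"])
catalog.write("etl", "customers", ["c1.parquet", "c2.parquet"])
main_before_merge = dict(catalog.branches["main"])

catalog.merge("etl")
main_after_merge = dict(catalog.branches["main"])
```

The design point is that both table updates become visible on main in a single step, so no reader ever sees one table updated without the other. A production catalog would also handle conflict detection and commit history, which this sketch leaves out.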
#datalakehouse #analytics #datawarehouse #datalake #dataengineers #dataarchitects #governance #dremiocloud #dremiotestdrive #opendatalakehouse #apacheiceberg #dremioarctic #selfservice #lakehousecatalog #dataascode #branches #automates #tableformat #dataoptimization #enterprisedata #reflections #fileformat #multi-tables #analytics #dataanalytis #timetravel #governance #security #accesscontrol #tableoptimization #tablecleanup #zero-copy #clones #productiondata #gitfordata #catalogversioning #ci #cd #merge #etl #branch

Oct 9, 2023 • 39min
EP35 - Your Lakehouse Just Got Gnarlier: What’s New in Dremio, including Next Gen Reflections
Imagine fast, intuitive analytics on all of your data where it lives - with the power of a data warehouse and the scale of a data lake. Dremio's open data lakehouse makes it easy to access, understand, and analyze all your data with a lightning fast SQL query engine and low-code and no-code options for all users. Learn what’s new in Dremio - including next gen Reflections SQL acceleration - and how you can accelerate self-service analytics at scale.
We’ll discuss:
- Next gen Reflections SQL acceleration and new Reflection Recommender to automatically create Reflections for your most important queries
- New Generative AI capabilities for text-to-SQL and more to make it possible for all users to interact with data
- Expanded table format support, including time-travel for Delta Lake
- Enhancements to our lightning-fast query engine that deliver even faster, more intelligent interactive analytics
- Our native Apache Iceberg lakehouse catalog, Dremio Arctic, now in Preview
See all upcoming episodes: https://www.dremio.com/gnarly-data-wa...
Connect with us!
Twitter: https://bit.ly/30pcpE1
LinkedIn: https://bit.ly/2PoqsDq
Facebook: https://bit.ly/2BV881V
Community Forum: https://bit.ly/2ELXT0W
Github: https://bit.ly/3go4dcM
Blog: https://bit.ly/2DgyR9B
Questions?: https://bit.ly/30oi8tX
Website: https://bit.ly/2XmtEnN
#datalakehouse #analytics #datawarehouse #datalake #dataengineers #dataarchitects #governance #infrastructure #dremiocloud #dremiotestdrive #openlakehouse #opendatalakehouse #apacheiceberg #dremioarctic #datamesh #metadata #modernization #datasharing #migration #ETL #datasilos #selfservice #compliance #dataascode #branches #optimized #automates #datamovement #clustering #metrics #filtering #partitioning #tableformat #ApacheArrow #projectnessie #dremiosonar #optimization #automaticdata #scalability #enterprisedata #federated #catalogmigratortool #reflections #ML #versioning #tables #catalog #accelerate #analytics #ELT #dataanalytis #query #real-timeanalytics #datastrategy #genreflections #sql #text-to-sql #timetravel #deltalake

Oct 8, 2023 • 21min
EP34 - Materialized Views and Dremio Reflections
In the world of data acceleration and optimization, both materialized views and Dremio's Data Reflections stand out as pivotal tools.
This video aims to demystify these technologies, comparing their benefits, limitations, and unique features. Dive deep into understanding the core differences between materialized views and Dremio's Reflections. Whether you're a seasoned data professional or just starting out, this webinar offers insights to optimize your data strategy.
Discover the nuances, best practices, and real-world applications of these powerful tools, and make informed decisions for your data architecture.
#datalakehouse #analytics #datawarehouse #datalake #dataengineers #dataarchitects #governance #infrastructure #dremiocloud #dremiotestdrive #openlakehouse #opendatalakehouse #apacheiceberg #dremioarctic #datamesh #metadata #modernization #datasharing #migration #ETL #datasilos #selfservice #compliance #dataascode #branches #optimized #automates #datamovement #clustering #metrics #filtering #partitioning #tableformat #ApacheArrow #projectnessie #dremiosonar #optimization #automaticdata #scalability #enterprisedata #federated #catalogmigratortool #reflections #ML #versioning #tables #catalog #accelerate #analytics #ELT #dataanalytis #query #real-timeanalytics #datastrategy


