
Gnarly Data Waves by Dremio EP20 - What's New in the Apache Iceberg Project: Updates, PyIceberg, Compute Engines
Jun 7, 2023
Chapters
Introduction
00:00 • 2min
Apache Iceberg: What's New?
02:06 • 3min
Apache Iceberg: A Project With a Dynamic Community
04:55 • 3min
Apache Iceberg: New Features
08:17 • 2min
Apache Iceberg's New Branching and Tagging Capabilities
10:01 • 3min
The Importance of Branching in Data Experiments
13:11 • 1min
The Importance of Branching in Data Engineering Workflows
14:34 • 4min
How to Migrate Data From Apache Hudi to an Iceberg Table
18:37 • 4min
The Core Library Changes for Amazon S3
23:04 • 2min
Apache Iceberg Adds Change Data Capture to Iceberg Tables
24:41 • 3min
How to Migrate Iceberg Tables From One Catalog to Another Without Copying Data
27:37 • 2min
How Spark Has Improved in This Particular Scenario
29:41 • 3min
Apache Spark 3.3: Support for Storage Partition Join
32:15 • 4min
Spark and Performance Updates
36:10 • 1min
The PR for Streaming Read in Flink Sink
37:36 • 2min
Dremio Sonar's New PR for Iceberg Tables
39:47 • 3min
Optimizing Query Performance With Dremio Sonar
42:41 • 2min
PyIceberg: A Python Library for Data Science
44:52 • 3min
How to Use PyIceberg to Improve Performance
48:13 • 3min
How to Implement Data Quality Checks in Iceberg
51:42 • 3min
