Gnarly Data Waves by Dremio

EP20 - What's New in the Apache Iceberg Project: Updates, PyIceberg, Compute Engines

Jun 7, 2023
Chapters
1
Introduction
00:00 • 2min
2
Apache Iceberg: What's New?
02:06 • 3min
3
Apache Iceberg: A Project With a Dynamic Community
04:55 • 3min
4
Apache Iceberg: New Features
08:17 • 2min
5
Apache Iceberg's New Branching and Tagging Capabilities
10:01 • 3min
6
The Importance of Branching in Data Experiments
13:11 • 1min
7
The Importance of Branching in Data Engineering Workflows
14:34 • 4min
8
How to Migrate Data From a Hudi Table to an Iceberg Table
18:37 • 4min
9
The Core Library Changes for Amazon S3
23:04 • 2min
10
Apache Iceberg Adds Change Data Capture to Iceberg Tables
24:41 • 3min
11
How to Migrate Iceberg Tables From One Catalog to Another Without Copying Data
27:37 • 2min
12
How Spark Has Improved in This Particular Scenario
29:41 • 3min
13
Apache Spark 3.3: Support for Storage-Partitioned Join
32:15 • 4min
14
Spark and Performance Updates
36:10 • 1min
15
The PR for Streaming Read in Flink Sink
37:36 • 2min
16
Dremio Sonar's New PR for Iceberg Tables
39:47 • 3min
17
Optimizing Query Performance With Dremio Sonar
42:41 • 2min
18
PyIceberg: A New Python Library for Data Science
44:52 • 3min
19
How to Use PyIceberg to Improve Performance
48:13 • 3min
20
How to Implement Data Quality Checks in Iceberg
51:42 • 3min