DataNation - Podcast for Data Engineers, Analysts and Scientists

Alex Merced Podcasts
Jun 4, 2024

56 – Open Source Apache Iceberg Catalogs (Nessie, Polaris, Gravitino)

Alex Merced discusses the value of open source Apache Iceberg catalogs in creating a truly open lakehouse environment without vendor lock-in. Check out my article on the subject: https://open.substack.com/pub/amdatalakehouse/p/open-source-table-format-open-source?r=h4f8p&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true
Follow me on Twitter at @amdatalakehouse
May 16, 2024

55 – Discussing the Apache Iceberg Kafka Connect Connector

In this episode, we delve into the Apache Iceberg Kafka Connector, a critical tool for streaming data into your data lakehouse. We’ll explore how this connector facilitates seamless data ingestion from Apache Kafka into Apache Iceberg, enhancing your real-time analytics capabilities and data lakehouse efficiency. We’ll cover: Join us to understand how the Apache Iceberg […]
Apr 20, 2024

54 – Major Architectural Differences between Apache Iceberg and Delta Lake (Partition Evolution and Hidden Partitioning)

Alex Merced discusses some of the major differences in how Apache Iceberg and Delta Lake work that lead to: Follow me on social https://bio.alexmerced.com/data
Apr 17, 2024

53 – Why Do Snowflake Bills Get So Large?

Alex Merced discusses the mistakes that make Snowflake bills get so large.
Hands-On Lakehouse Laptop Exercises:
– MongoDB with Dremio: https://bit.ly/am-mongodb-dashboard
– SQL Server with Dremio: https://bit.ly/am-sqlserver-dashboard
– Postgres with Dremio: https://bit.ly/am-postgres-to-dashboard
https://bio.alexmerced.com/data
Mar 28, 2024

52 – Apache Iceberg, Dremio and PuppyGraph

Discussing the benefits of Apache Iceberg's open data ecosystem. Exploring graph data processing with Dremio, PuppyGraph, and Apache Iceberg. Covers the efficiency and flexibility of Apache Iceberg for data lakes, how it overcomes data duplication challenges, and the diverse data modeling possibilities it enables.
Mar 25, 2024

#1 – intro to catalogs, manifests and metadata. Oh my!

Alex Merced introduces his new podcast exploring open-source data projects like Apache Iceberg. The episode discusses the importance of catalogs, manifests, and metadata in developing advanced data systems affordably. Listeners are encouraged to subscribe for future in-depth explorations of open source project architectures.
Mar 18, 2024

51 – Open Data Standards (Apache Iceberg, Apache Parquet, Apache Arrow, Apache Ibis, Apache Substrait)

Explore the benefits of open data standards like Apache Arrow and Apache Iceberg: optimizing data transfer efficiency with Apache Arrow Flight and ADBC, enhancing scan planning in data catalogs with the Apache Iceberg spec and Apache Ibis, and standardizing data frameworks and SQL query processing with Substrait. The episode also covers the value of standardized open data formats and systems for innovation and efficiency.
Feb 21, 2024

50 – Thinking about the flow of Streaming/Real-Time Data

Alex Merced reflects on the development of real-time data pipelines.
Feb 2, 2024

48 – Understanding how Lakehouse Table Formats are Implemented in your Favorite Tools

Alex Merced discusses how lakehouse table formats like Apache Iceberg, Apache Hudi, and Delta Lake are implemented in your favorite tools. The podcast explores Java libraries, file structures, metadata tables, and file slices. It also covers implementing formats in different languages, query performance, and the differences between the Apache Iceberg, Apache Hudi, and Delta Lake formats.
Jan 21, 2024

47 – Understanding your cloud costs (Storage, Egress, Compute, Serverless, etc.)

Exploring cloud costs, distributed file systems, object storage, and tiered storage models. Understanding cost-effective cloud service models and navigating cloud costs. Emphasizing the importance of optimizing data handling for cost efficiency.