Data Engineering Podcast

Being Data Driven At Stripe With Trino And Iceberg

Jun 16, 2024
Learn how Stripe uses Trino and Iceberg for its data lakehouse: business analytics, the challenges of large datasets, optimizing with Iceberg, and the transition to a REST catalog. The episode also covers the advantages of monitoring queries and of managing a multi-tool ecosystem with Trino and Spark, along with the challenges and innovations in cloud data management at Stripe.
53:20

Podcast summary created with Snipd AI

Quick takeaways

  • Trino optimizes query performance and concurrency for efficient data access at Stripe.
  • Iceberg's metadata management aids complex joins and optimizes large datasets for effective analytics.

Deep dives

Starburst: An End-to-End Data Lake Platform on Trino and Apache Iceberg

Starburst offers a data lake platform built on Trino that supports all table formats, including Apache Iceberg, Hive, and Delta Lake. Teams such as Comcast and DoorDash use it to scale high-quality data workflows. Trino is used mainly for business analytics: transforming large datasets and powering dashboards. Thanks to its distributed architecture, it is described as outperforming Redshift, making ad hoc analytics fast and efficient.
