
Data Archives - Software Engineering Daily

Building a Data Lake with Adam Ferrari

Feb 6, 2024
Adam Ferrari, SVP of Engineering at Starburst, discusses building a data lake analytics platform and the work happening at Starburst. The conversation covers the history and purpose of Starburst, the growing interest in data lakes, and the challenges of building and maintaining one. It also covers the scalability, performance, and architecture of Trino, the open-source project that forms the foundation of Starburst. Finally, it turns to the challenges of managing a data lake, including integrating with streaming services and keeping up with evolving lake formats.
46:19


Podcast summary created with Snipd AI

Quick takeaways

  • Starburst is a data lake analytics platform that lets users work with structured data at scale, built on the open-source Trino query engine.
  • Data lakes provide a scalable solution for managing and analyzing large and diverse datasets, bridging the gap between traditional data warehousing solutions and the increasing volume and variety of data.

Deep dives

Starburst: A Data Lake Analytics Platform

Starburst is a data lake analytics platform built on the open-source Trino query engine. Adam Ferrari, SVP of Engineering at Starburst, discusses the platform's capabilities and its use for working with structured data at scale. Starburst builds on Trino to let users federate and analyze data across multiple sources, including object storage and structured databases. The platform exposes a single SQL interface for querying and analyzing that data, which suits it to large and varied datasets.
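
As a rough illustration of what a single SQL interface over federated sources can look like: Trino addresses tables as catalog.schema.table, where each catalog maps to a configured data source. The sketch below only builds the text of such a query. The catalog, schema, table, and column names are hypothetical (not from the episode), and nothing here connects to a Trino or Starburst cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    """A fully qualified table name in Trino's catalog.schema.table form."""
    catalog: str
    schema: str
    table: str

    def qualified(self) -> str:
        return f"{self.catalog}.{self.schema}.{self.table}"


def federated_join_sql(left: TableRef, right: TableRef, key: str) -> str:
    """Build one SQL statement that joins tables living in two catalogs."""
    return (
        f"SELECT l.{key}, count(*) AS events\n"
        f"FROM {left.qualified()} AS l\n"
        f"JOIN {right.qualified()} AS r ON l.{key} = r.{key}\n"
        f"GROUP BY l.{key}"
    )


# Hypothetical names: 'lake' stands for a catalog over object storage,
# 'crm' for a catalog over a relational database.
events = TableRef("lake", "web", "click_events")
customers = TableRef("crm", "public", "customers")

query = federated_join_sql(events, customers, "customer_id")
print(query)
```

The point of the sketch is that both sources appear in one statement and differ only in their catalog prefix; the engine, not the caller, is responsible for reaching each underlying system.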
