Data Engineering Weekly

The Future of Data Lakehouses: A Fireside Chat with Vinoth Chandar - Founder CEO Onehouse & PMC Chair of Apache Hudi

Jan 9, 2025
Vinoth Chandar, Founder and CEO of Onehouse and PMC Chair of Apache Hudi, discusses the evolution of lakehouse technology. He shares insights on Apache Hudi's impact on data engineering and explores the challenges of building high-scale data ecosystems. The conversation highlights innovations in Hudi 1.0, including enhanced concurrency and update features. It also covers the role of open source in the data landscape, emphasizing the importance of standardization and collaboration among emerging data formats.
ANECDOTE

Hudi's Unexpected Rise

  • Vinoth Chandar didn't initially envision Apache Hudi's impact.
  • GDPR's arrival in 2019-2020 propelled Hudi into the mainstream, driven by the need for structured data lake management.
INSIGHT

Hudi's Strengths

  • Hudi's design is uniquely suited for CDC pipelines and incremental data lake writes.
  • It also excels at batch ETL and incremental transformations.
ADVICE

Complexity vs. Simplicity

  • Don't simplify the design of a complex data system solely to make it easy to understand.
  • Hudi's complexity stems from its rich features and database-like capabilities.