Comparing Nessie and lakeFS in Data Ecosystems
Nessie and lakeFS both bring versioning to data ecosystems, but they capture different things. Nessie captures metadata changes: for example, when an insert into an Iceberg table creates multiple new files, Nessie captures the resulting metadata change. lakeFS captures deltas in the actual files, adding and subtracting files to reflect changes in the data. Both projects emerged around the same time and initially aimed to use Git semantics, but each realized it needed a different abstraction because of the nature and volume of data changes.
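The difference between the two capture styles can be sketched with a toy model. This is only an illustration: the classes `MetadataCatalog` and `FileDeltaStore`, their methods, and the file paths are hypothetical and do not mirror the real Nessie or lakeFS APIs.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataCatalog:
    """Hypothetical metadata-capture store: each commit records a metadata pointer."""
    commits: list = field(default_factory=list)

    def commit(self, table: str, metadata_location: str) -> int:
        # One entry per table change, however many data files the write created.
        self.commits.append({table: metadata_location})
        return len(self.commits) - 1


@dataclass
class FileDeltaStore:
    """Hypothetical file-delta store: each commit records files added and removed."""
    commits: list = field(default_factory=list)

    def commit(self, added: set, removed: set) -> int:
        self.commits.append({"added": set(added), "removed": set(removed)})
        return len(self.commits) - 1

    def files_at(self, commit_id: int) -> set:
        # Replay the deltas in order to rebuild the file set at a given commit.
        files = set()
        for c in self.commits[: commit_id + 1]:
            files |= c["added"]
            files -= c["removed"]
        return files


# An insert that writes three new data files (paths are made up).
new_files = {f"data/part-{i}.parquet" for i in range(3)}

catalog = MetadataCatalog()
catalog.commit("sales", "metadata/v1.metadata.json")
catalog.commit("sales", "metadata/v2.metadata.json")  # the insert: one pointer change

store = FileDeltaStore()
store.commit(added={"data/part-old.parquet"}, removed=set())
store.commit(added=new_files, removed=set())  # the insert: three file additions
```

In this sketch the same insert shows up as a single new metadata pointer in the catalog and as a set of added files in the delta store, which is the distinction the comparison above draws.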
Data lakehouse architectures are gaining popularity due to the flexibility and cost effectiveness that they offer. The link that bridges the gap between data lake and warehouse capabilities is the catalog. The primary purpose of the catalog is to inform the query engine of what data exists and where, but the Nessie project aims to go beyond that simple utility. In this episode Alex Merced explains how the branching and merging functionality in Nessie allows you to use the same versioning semantics for your data lakehouse that you are used to from Git.
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA