How Denormalized is Building ‘DuckDB for Streaming’ with Apache DataFusion
Sep 13, 2024
Amey Chaugule and Matt Green, co-founders of Denormalized, share their extensive engineering backgrounds from top tech firms. They discuss the creation of an embedded stream processing engine designed to simplify real-time data workloads. The duo tackles challenges in existing systems like Spark and Kafka, emphasizing developer experience and state management. They also compare DuckDB and SQLite in the context of streaming data, highlighting the future of user-friendly data tools and the importance of fault tolerance in modern applications.
Denormalized is developing an embedded stream processing engine that simplifies real-time data workloads by leveraging Apache DataFusion's single-node capabilities.
The difficulty of achieving fault tolerance in streaming systems often leads practitioners to skip checkpointing altogether, a risky trade-off for applications that depend on continuous data processing.
Improving developer experience through accessible interfaces, such as TypeScript, is key to attracting a wider audience to modern streaming systems.
Deep dives
Evolution of Streaming Systems
Streaming systems have evolved significantly, with a succession of frameworks emerging to handle stream processing workloads. Amey and Matt of Denormalized have worked on several of these platforms, most notably at Uber, where they operated massive Kafka deployments. Those experiences exposed the complexities of real-time data processing and the difficulty of achieving fault tolerance, and led them to question whether traditional assumptions about streaming systems, particularly around fault tolerance, still align with how teams actually run them today.
Challenges of Fault Tolerance
Achieving fault tolerance in streaming systems is notably difficult, often requiring complex consensus mechanisms and checkpointing processes. The conversation reveals that many practitioners opt to run systems without checkpointing due to the back pressure introduced during that process. This raises challenges for maintaining continuous data processing, especially in critical applications requiring real-time results. By simplifying the architecture around fault tolerance, teams could streamline operations and reduce the likelihood of failure, particularly when leveraging single-node systems.
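To make that trade-off concrete, here is a minimal, purely illustrative Rust sketch of periodic state checkpointing in a single-node consumer loop; the event stream, state shape, and checkpoint format are hypothetical and are not Denormalized's API:

```rust
use std::collections::HashMap;
use std::fs;
use std::time::{Duration, Instant};

// Illustrative only: a single-node event counter that periodically snapshots
// its in-memory state to disk. Real engines coordinate these snapshots with
// source offsets so processing can resume exactly where it left off.
fn main() -> std::io::Result<()> {
    let mut counts: HashMap<String, u64> = HashMap::new();
    let checkpoint_interval = Duration::from_secs(10);
    let mut last_checkpoint = Instant::now();

    // Stand-in for an unbounded stream of events from Kafka or a socket.
    for event in ["click", "view", "click"] {
        *counts.entry(event.to_string()).or_insert(0) += 1;

        // Periodically persist state; while the snapshot is being written,
        // upstream consumption stalls, which is the back pressure that pushes
        // some teams to disable checkpointing entirely.
        if last_checkpoint.elapsed() >= checkpoint_interval {
            fs::write("checkpoint.txt", format!("{counts:?}"))?;
            last_checkpoint = Instant::now();
        }
    }
    Ok(())
}
```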
The Need for Simplicity in Stream Processing
The current landscape of stream processing is dominated by intricate distributed systems that many users find cumbersome to operate. Denormalized aims to simplify stream processing by building an embedded engine designed to run efficiently on a single node, since a significant portion of streaming workloads does not actually require the complexity of a distributed system. Simplifying the user experience while retaining performance could lead to wider adoption across diverse engineering teams.
Strengthening Developer Experience
Improving the developer experience of stream processing systems is critical for attracting a broader audience, especially as new roles like AI engineer emerge. Denormalized is considering a range of interfaces for its technology, including TypeScript bindings for better accessibility and usability. Better interfaces would let the company serve application developers who build user-facing features, bridging the gap between traditional streaming systems and modern application development and making streaming applications easier to implement and operate.
DataFusion's Role in Innovation
DataFusion is positioned as a versatile query engine that plays a crucial role in the development of new data systems. By enabling rapid prototyping and deep extensibility, it lets companies like Denormalized innovate quickly without building core query infrastructure from scratch. That flexibility makes it practical to experiment with new streaming designs without extensive overhead, and as the DataFusion community grows, it creates opportunities to expand capabilities and collaborate across projects in the emerging data landscape.
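As a rough illustration of that prototyping speed, the sketch below, which assumes the datafusion and tokio crates and a local events.csv file, registers a CSV source and runs a SQL aggregation in a handful of lines of Rust:

```rust
use datafusion::error::Result;
use datafusion::prelude::*;

// Minimal DataFusion example: register a CSV file as a table and query it
// with SQL. The file name and column names are assumptions for illustration.
#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();
    ctx.register_csv("events", "events.csv", CsvReadOptions::new())
        .await?;

    let df = ctx
        .sql("SELECT user_id, COUNT(*) AS events FROM events GROUP BY user_id")
        .await?;
    df.show().await?; // print the aggregated results to stdout
    Ok(())
}
```

The same extension points, such as custom table providers, user-defined functions, and custom physical operators, are what allow a streaming engine to reuse DataFusion's planner and execution model rather than building them from scratch.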
In this episode, Kostas and Nitay are joined by Amey Chaugule and Matt Green, co-founders of Denormalized. They delve into how Denormalized is building an embedded stream processing engine—think “DuckDB for streaming”—to simplify real-time data workloads. Drawing on their backgrounds at companies like Uber, Lyft, Stripe, and Coinbase, Amey and Matt discuss the challenges of existing stream processing systems such as Spark, Flink, and Kafka, and explain how their approach leverages Apache DataFusion to create a single-node solution that reduces the complexities inherent in distributed systems.
The conversation explores topics such as developer experience, fault tolerance, state management, and the future of stream processing interfaces. Whether you’re a data engineer, application developer, or simply interested in the evolution of real-time data infrastructure, this episode offers valuable insights into making stream processing more accessible and efficient.
Chapters
00:00 Introduction and Background
12:03 Building an Embedded Stream Processing Engine
18:39 The Need for Stream Processing in the Current Landscape
22:45 Interfaces for Interacting with Stream Processing Systems
26:58 The Target Persona for Stream Processing Systems
31:23 Simplifying Stream Processing Workloads and State Management
34:50 State and Buffer Management
37:03 Distributed Computing vs. Single-Node Systems
42:28 Cost Savings with Single-Node Systems
47:04 The Power and Extensibility of DataFusion
55:26 Integrating Data Store with DataFusion
57:02 The Future of Streaming Systems