

How Denormalized is Building ‘DuckDB for Streaming’ with Apache DataFusion
Sep 13, 2024
Amey Chaugule and Matt Green, co-founders of Denormalized, share their extensive engineering backgrounds from top tech firms. They discuss the creation of an embedded stream processing engine designed to simplify real-time data workloads. The duo tackles challenges in existing systems like Spark and Kafka, emphasizing developer experience and state management. They also compare DuckDB and SQLite in the context of streaming data, highlighting the future of user-friendly data tools and the importance of fault tolerance in modern applications.
Amey's Background
- Amey Chaugule's career started with Hadoop at Yahoo and spanned ML infrastructure at Uber and Coinbase.
- At Uber, he worked on real-time user sessionization, a crucial component for dynamic pricing and supply positioning.
Matt's Background
- Matt Green, co-founder of Denormalized, has a background in full-stack engineering and data systems at companies like Booster Fuels and Lyft.
- At Lyft, he worked on driver incentives, using real-time data and high-scale systems to smooth driver supply.
Stream Processing Use Cases
- Stream processing engines handle large data volumes under tight latency requirements, which is crucial for applications like real-time pricing models and recommendation systems (a minimal sketch of such a workload follows this list).
- They are used in machine learning inference, where models trained on batch data need to process individual events rapidly.
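To make that kind of workload concrete, here is a minimal, dependency-free Rust sketch of a tumbling-window aggregation, the sort of continuously updated state a streaming engine maintains for a real-time pricing signal. The Event type, field names, and 10-second window are illustrative assumptions for this sketch, not the Denormalized or DataFusion API.

```rust
use std::collections::BTreeMap;

/// A single event on the stream: an arrival timestamp (ms) and a city id.
struct Event {
    ts_ms: u64,
    city: &'static str,
}

/// Assign each event to a tumbling window of `window_ms` milliseconds and
/// count events per (window, city) pair -- the kind of aggregate a
/// real-time pricing model would read to gauge current demand.
fn windowed_counts(events: &[Event], window_ms: u64) -> BTreeMap<(u64, &'static str), u64> {
    let mut counts = BTreeMap::new();
    for e in events {
        // Truncate the timestamp to the start of its window.
        let window_start = (e.ts_ms / window_ms) * window_ms;
        *counts.entry((window_start, e.city)).or_insert(0u64) += 1;
    }
    counts
}

fn main() {
    let events = vec![
        Event { ts_ms: 1_000, city: "sf" },
        Event { ts_ms: 4_500, city: "sf" },
        Event { ts_ms: 12_000, city: "nyc" },
    ];
    for ((window, city), n) in windowed_counts(&events, 10_000) {
        println!("window starting at {window} ms, city {city}: {n} events");
    }
}
```

In a real engine the events arrive continuously rather than as a slice, and the window state must be checkpointed for fault tolerance, which is the state-management problem the hosts and guests discuss in the episode.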