

Practical Applications for DuckDB (with Simon Aubury & Ned Letcher)
Jul 31, 2024
Simon Aubury and Ned Letcher delve into the practical applications of DuckDB, an in-process analytical database. They discuss its ability to read a variety of file formats, infer schemas automatically, and run efficiently across platforms. The duo covers the differences between transactional and analytical databases and the importance of data contracts. They also highlight DuckDB's extensibility through community-driven extensions and its integration with R for analytics, all while sharing insights from co-authoring their new book on DuckDB.
AI Snips
Twitter Social Graph Analysis
- Ned Letcher used DuckDB to analyze his Twitter social graph data.
- He queried 70 million JSON records on his desktop in under a minute.
Fitbit Data Wrangling
- Simon Aubury downloaded his Fitbit data, a massive, multi-gigabyte zip file.
- It contained 70,000 files in various formats, including JSON and CSV, with inconsistent date and time formats.
Hadoop Use Case
- Simon used DuckDB at work on a Hadoop cluster with sensitive medical data.
- DuckDB queried the data efficiently and locally, with no need to transfer it to the cloud, acting as a local query engine.