Thoughtworks Technology Podcast

Exploring DuckDB: A relational database built for online analytical processing

Sep 19, 2024
This episode introduces DuckDB, a relational database built for online analytical processing (OLAP). The hosts discuss how its design serves both data engineers and analysts, and share personal stories of using it on complex data tasks. They cover how it simplifies large data workflows, how it integrates with tools like pandas, and where it fits in CI/CD workflows, and they point to community resources and support for new users.
INSIGHT

DuckDB Overview

  • DuckDB is an open-source, in-process, column-oriented database.
  • It's designed for analytical workloads, benefiting data scientists and engineers.
ANECDOTE

Ned's Twitter Data

  • Ned Letcher was drawn to DuckDB after hearing online buzz and used it to process 70 million Twitter records.
  • Its speed and ease of use impressed him.
ANECDOTE

Simon's Fitness Data

  • Simon Aubury used DuckDB to analyze 10 years of personal fitness tracker data (70 GB, 85,000 files, 27 formats).
  • It simplified data ingestion and analysis.