
#491: DuckDB and Python: Ducks and Snakes living together
Talk Python To Me
Columnar Database Indexing
Columnar databases like DuckDB automatically build coarse min-max indexes (zone maps), one for roughly every 100,000 rows. Each index stores the minimum and maximum values within its chunk of rows, which enables efficient filtering. When a query asks for a specific range (for example, data from this week), DuckDB compares the filter against these min/max values to determine which chunks could contain matching rows and skips the rest. This can significantly speed up analytical queries that examine trends in large datasets. These are not row-level indexes, but the approach is generally well suited to analytical workloads. For more selective lookups, an explicit Adaptive Radix Tree (ART) index can be added.
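The chunk-skipping idea can be sketched in a few lines of plain Python. This is only an illustration of min/max pruning, not DuckDB's actual implementation; the `Chunk` class, `build_chunks`, `range_scan`, the tiny chunk size, and the sample data are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass

CHUNK_SIZE = 4  # tiny for illustration; the summary mentions roughly 100,000 rows


@dataclass
class Chunk:
    values: list
    min_value: int
    max_value: int


def build_chunks(values, chunk_size=CHUNK_SIZE):
    """Split a column into chunks and record each chunk's min and max."""
    chunks = []
    for start in range(0, len(values), chunk_size):
        block = values[start:start + chunk_size]
        chunks.append(Chunk(block, min(block), max(block)))
    return chunks


def range_scan(chunks, low, high):
    """Return values in [low, high] and the number of chunks actually scanned."""
    hits, scanned = [], 0
    for chunk in chunks:
        # Skip the chunk when its [min, max] range cannot overlap the query range.
        if chunk.max_value < low or chunk.min_value > high:
            continue
        scanned += 1
        hits.extend(v for v in chunk.values if low <= v <= high)
    return hits, scanned


# Hypothetical column of day numbers, stored roughly in insertion order.
days = [1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10, 11, 12, 12]
chunks = build_chunks(days)
hits, scanned = range_scan(chunks, low=10, high=12)
print(hits, f"scanned {scanned} of {len(chunks)} chunks")
```

Pruning like this pays off most when values arrive in roughly sorted order, as timestamps usually do, because each chunk then covers a narrow range and most chunks can be ruled out from their min/max alone.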



