Data Engineering Podcast

Citus Data: Distributed PostgreSQL for Big Data with Ozgun Erdogan and Craig Kerstiens - Episode 13

Jan 8, 2018
Ozgun Erdogan and Craig Kerstiens from Citus Data discuss their work on scaling out PostgreSQL, including replication models, distributed backups, and upcoming features for real-time analytics. They also explore the considerations for deploying Citus and compare it to other offerings like Redshift and BigQuery.
46:44

Podcast summary created with Snipd AI

Quick takeaways

  • Citus is an extension for PostgreSQL that enables horizontal scaling and improved performance, making it suitable for handling large datasets and complex queries.
  • Citus is used for real-time analytics, multi-tenant applications, and workloads that combine the benefits of NoSQL with those of a relational database, where performance and scalability are critical.

Deep dives

Citus enables scaling and performance for relational databases

Citus is a PostgreSQL extension that scales the database horizontally by distributing data across multiple machines, which lets it handle large datasets and complex queries. It is particularly well suited to multi-tenant applications, real-time analytics, and workloads that combine the best aspects of NoSQL with the relational properties of PostgreSQL. Citus shards tables across nodes, runs queries against the shards in parallel, and merges the partial results for faster response times. Because it runs as a PostgreSQL extension, users can keep their existing tools and processes. Features such as C-Store (cstore_fdw) for column-oriented storage, along with upcoming extensions for approximate count-distinct and percentile calculations, continue to extend its capabilities.
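
The shard, parallelize, and merge pattern described above can be illustrated with a small in-memory sketch. This is not Citus code and does not reflect how Citus is implemented; it only mimics the general scatter-gather idea in plain Python. The shard count, the `tenant` column, and all function names are hypothetical choices made for the illustration.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

SHARD_COUNT = 4  # hypothetical; chosen only for this illustration


def shard_for(key: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a distribution key to a shard with a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def distribute(rows, key_column, shard_count=SHARD_COUNT):
    """Split rows into shards so rows sharing a key land in the same shard."""
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        shards[shard_for(row[key_column], shard_count)].append(row)
    return shards


def partial_count(shard_rows):
    """Per-shard work: count events per tenant using only local rows."""
    counts = defaultdict(int)
    for row in shard_rows:
        counts[row["tenant"]] += 1
    return dict(counts)


def merge_counts(partials):
    """Coordinator step: combine the partial results from every shard."""
    merged = defaultdict(int)
    for partial in partials:
        for tenant, count in partial.items():
            merged[tenant] += count
    return dict(merged)


def count_events_by_tenant(rows):
    """Scatter the aggregate to all shards in parallel, then gather."""
    shards = distribute(rows, "tenant")
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(partial_count, shards))
    return merge_counts(partials)


events = [
    {"tenant": "acme", "event": "login"},
    {"tenant": "globex", "event": "login"},
    {"tenant": "acme", "event": "purchase"},
    {"tenant": "initech", "event": "login"},
    {"tenant": "acme", "event": "logout"},
    {"tenant": "globex", "event": "purchase"},
]

result = count_events_by_tenant(events)
```

Hashing on the tenant keeps each tenant's rows together on one shard, which is why the multi-tenant case fits this pattern: a query scoped to one tenant only needs to touch a single shard, while an analytics query fans out to all of them.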
