Data Engineering Podcast

StarRocks: Bridging Lakehouse and OLAP for High-Performance Analytics

May 5, 2025
Sida Shen, a product manager at CelerData and a contributor to StarRocks, discusses high-performance analytical databases. He traces the origins of StarRocks and its evolution from Apache Doris into a lakehouse query engine. Topics include handling high-concurrency, low-latency queries, bridging traditional OLAP with lakehouse architecture, and the importance of integrating with table formats like Apache Iceberg. Sida also highlights the challenges of denormalization and real-time data processing in modern analytics.
59:41

Podcast summary created with Snipd AI

Quick takeaways

  • StarRocks is designed as a high-performance analytical database that supports both shared-nothing and shared-data architectures to optimize query performance.
  • The unique architecture of StarRocks separates front-end query management from back-end execution, enhancing scalability and maintaining low latency under heavy loads.

Deep dives

The Challenge of Data Migrations

Data migrations are often lengthy and resource-intensive, and they can burn out the teams that run them. Some migrations stretch over months or even years. AI-powered migration agents can shorten this considerably, with some organizations reporting migrations completed up to ten times faster than with traditional methods. Faster migrations improve project timelines and team morale by removing much of the stress typically associated with these projects.
