Database Scaling at Figma with Sammy Steele

Software Engineering Daily

NOTE

Prioritizing Reliability Over Cost

The organization prioritizes reliability and scale over cost efficiency, and is willing to allocate extra resources so that failovers are seamless and data movement is efficient. Database operations involve consistency-versus-availability trade-offs and the risk of unknown unknowns, so the company invests heavily in playbooks for worst-case scenarios. The team has recovered from unexpected failures within minutes, but the goal is to automate that recovery further. For now, the focus is on over-allocating resources so that each shard split leaves room for 6 to 12 months of growth. Restart operations are designed to be transparent to application developers, minimizing the impact on their experience.
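
The 6 to 12 month headroom target can be illustrated with some back-of-the-envelope arithmetic. The sketch below is not from the episode: the function names, the compounding monthly growth model, and the 80% utilization ceiling are all assumptions chosen for illustration.

```python
import math


def months_of_headroom(current_utilization: float,
                       monthly_growth: float,
                       max_utilization: float = 0.8) -> float:
    """Months until compounding growth pushes a shard past max_utilization.

    All parameters are hypothetical inputs: utilization values are fractions
    of shard capacity, and monthly_growth is a fractional rate (0.1 = 10%).
    """
    if not 0 < current_utilization <= max_utilization:
        raise ValueError("current_utilization must be in (0, max_utilization]")
    if monthly_growth <= 0:
        return math.inf  # no growth means the shard never fills up
    return (math.log(max_utilization / current_utilization)
            / math.log(1 + monthly_growth))


def shards_needed(total_load: float,
                  shard_capacity: float,
                  monthly_growth: float,
                  target_months: int,
                  max_utilization: float = 0.8) -> int:
    """Shard count that keeps every shard under max_utilization after
    target_months of compounding growth, assuming load spreads evenly."""
    projected_load = total_load * (1 + monthly_growth) ** target_months
    usable_capacity = shard_capacity * max_utilization
    return math.ceil(projected_load / usable_capacity)
```

For example, under these assumptions a shard at 40% utilization growing 10% per month has a little over seven months before it crosses the 80% ceiling. Sizing a split with `target_months=12` instead of `0` is what "over-allocating" means here: paying for idle capacity today to avoid another split in the near term.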
