Uma: I think the most important point that resonates with me that you've made CARTIC is around the different platforms, having different recovery times. So then how do you know how does it behave in your case? Well, one solution would be to maybe apply litmus chaos and see how it behaves in practice. Things improve most of the time, but sometimes they get worse. Uma: As much as we want to be confident from our staging experiments, the best failures happen in production. But always remember that unless you are breaking things confidently, your systems are not reliable.
Transcript
chevron_right
Play full episode
chevron_right
Transcript
Episode notes
In today’s episode, Gerhard is joined by Uma, CEO and co-founder of ChaosNative, as well as Karthik, CTO and also a ChaosNative co-founder. They talk Chaos Engineering and Litmus.
Chaos Engineering is not just for super SREs. It is not meant to prevent outages. And, it is not just about hardware. Chaos Engineering is about testing how reliable your systems are. It’s meant to show you how things fail, including when other dependent systems fail - think cascading failures. This is a good way to discover inconvenient truths about that beautiful code that you wrote. Everything fails, and great insights are to be found when it does.
Changelog++ members get a bonus 1 minute at the end of this episode and zero ads. Join today!
Sponsors:
Render – The Zero DevOps cloud that empowers you to ship faster than your competitors. Render is built for modern applications and offers everything you need out-of-the-box. Learn more at render.com/changelog or email changelog@render.com for a personal introduction and to ask questions about the Render platform.
Sentry – Working code means happy customers. That’s exactly why teams choose Sentry. From error tracking to performance monitoring, Sentry helps teams see what actually matters, resolve problems quicker, and learn continuously about their applications - from the frontend to the backend. Use the code SHIPIT and get the team plan free for three months.
Teleport – Teleport Access Plane lets you access any computing resource anywhere. Engineers and security teams can unify access to SSH servers, Kubernetes clusters, web applications, and databases across all environments. Try Teleport today in the cloud, self-hosted, or open source at goteleport.com
Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com.