Mentioned in 1 episode

Site Reliability Engineering

How Google Runs Production Systems
Book • 2016
This book provides insights into how Google's Site Reliability Engineering (SRE) teams manage and maintain large-scale systems.

It covers principles, practices, and management strategies that enable scalable, reliable, and efficient systems.

The book is divided into sections covering an introduction to SRE, its core principles, day-to-day practices, and management best practices.

Mentioned by
Justin Garrison
as a book that highlights the power Google gave its SRE team to say no to supporting unreliable services.
Animating the Stack with Sam Rose
Mentioned by
Stephen Welch
while discussing large-scale databases and reliable software systems.
453: Big Global Problems Worth Solving with Machine Learning
