

#46501
Mentioned in 1 episodes
Site Reliability Engineering
How Google Runs Production Systems
Book • 2016
This book provides insights into how Google's Site Reliability Engineering (SRE) teams manage and maintain large-scale systems.
It covers principles, practices, and management strategies that enable scalable, reliable, and efficient systems.
The book is divided into sections that explore the introduction to SRE, its core principles, day-to-day practices, and management best practices.
Mentioned by

as a book that highlights the power that Google gave to its SRE team to say no to supporting unreliable services.

Justin Garrison

Animating the Stack with Sam Rose
Mentioned by 

while discussing large-scale databases and reliable software systems.


Stephen Welch

453: Big Global Problems Worth Solving with Machine Learning