
Google SRE Prodcast
SRE Prodcast brings Google's experience with Site Reliability Engineering together with special guests and exciting topics to discuss the present and future of reliable production engineering!
Latest episodes

Sep 18, 2024 • 31min
Production Problems Are For All! with Ben Treynor Sloss
Ben Treynor Sloss (VP of Engineering, Google) joins hosts Steve McGhee and Dr. Jennifer Petoff (Director of Technical Infrastructure Education, Google) to share the evolution of SRE and its impact on software development, how AI and ML significantly impact SRE practices, and the future of SRE. Ben coined the term "Site Reliability Engineering" for his team of (now) 4,000 software engineers, engaged in what were traditionally operations functions. Under Ben's leadership, Google SRE wrote two best-selling books on SRE. Since then, the rest of the SaaS industry has come to adopt the SRE name, mission, and practices.

Sep 11, 2024 • 26min
There Remains a Huge Amount of Work to Do, with Healfdene Goguen
In this episode, Healfdene Goguen (Principal Engineer, Google) joins hosts Steve McGhee and Jordan Greenberg to discuss the vast amount of work to be done by SREs, and the fascinating challenges to tackle with clear real-world implications. It's a truly exciting time to be an SRE at Google!

Sep 4, 2024 • 41min
SRE, a Basis of Influence, with Amy Tobey & Vladyslav Ukis
In this season of the Google SRE Prodcast, current and former SREs, both within and outside of Google, chat with hosts Steve McGhee and Jordan Greenberg to discuss software systems designed and built by SREs. For "episode zero", guests Amy Tobey (Live Services SRE, Netflix) and Dr. Vladyslav Ukis (Head of R&D, Siemens Healthineers, Author of "Establishing SRE Foundations") set the stage for the season with a lively discussion about what Software Engineering means to Site Reliability Engineering.

Nov 7, 2023 • 47min
Life of An SRE: Life after Google SRE, with Carla Geisser, Cody Smith, and Laura Nolan
Former Google SREs discuss site reliability engineering outside of Google. They talk about the transferability of knowledge and skills, implementing circuit breaking and throttling techniques for improved reliability, the importance of understanding system behavior and building trust, the concept of system thinking in SRE, advances in internet reliability, and security challenges for SREs.

Oct 31, 2023 • 51min
Life of An SRE with Sabrina Farmer
Sabrina Farmer, VP of Engineering at Google, talks about her career journey through Site Reliability Engineering. What does management mean? What’s involved in being an effective manager? And what’s a feasibility study? Hear some great advice on how to get what you expect out of a role, wherever on the ladder it is.

Oct 17, 2023 • 30min
Life of An SRE with Dave Reisner
Dave Reisner talks about his path to Staff SRE, from ArchLinux contributor through DevOps to software engineer. This episode emphasizes the value of strong mentoring and manager relationships, and the challenges of work-life balance.

Oct 10, 2023 • 32min
Life of an SRE with Stephen Benjamin
Explore the role and responsibilities of an SRE manager with Stephen Benjamin.

Oct 3, 2023 • 26min
Life of An SRE with Jessica Theodat
Explore the role and responsibilities of a Senior SRE with Jessica Theodat, as she discusses life-work balance, the value of mentoring, and being a Black woman in SRE.

Sep 26, 2023 • 44min
Life of An SRE with Shannon Brady and Theo Klein
Explore the career paths of SREs Shannon Brady and Theo Klein as they discuss their paths to Site Reliability Engineering and finding their areas of expertise.

Sep 19, 2023 • 35min
Life of An SRE with Mariuxi Vasconez and Julian Alarcon
In this episode, Mariuxi and Julian discuss their paths to SRE: what drew them initially to SRE, and what motivates them to continue developing skills.