
Reliability Enablers

Latest episodes

Jun 29, 2023 • 16min

#6 Building a successful SRE practice through capabilities

We discuss the need for a framework to guide the development of Site Reliability Engineers (SREs) and drive value for organizations. You will learn about our pillar view of areas like observability and service management, which helps identify areas for improvement, and why it pays to focus on a few key areas at a time. We also discuss the challenges of hiring experienced SRE practitioners and suggest developing existing employees' skills and capabilities so they can become effective SREs. A capability view of SRE work can help establish a clear career path for SREs within an organization while aligning with pressing organizational goals.

Timestamps for key concepts

Identifying SRE Pillars [00:00:20]
Discussion of the different technology disciplines or practices that SREs can work in, such as observability, release engineering, service management, DevSecOps, performance and capacity engineering, platform engineering, and developer experience.

Focus Areas for SREs [00:02:27]
Importance of focusing on a few areas at a time and diving deep into them to identify and overcome challenges. The speakers discuss their current focus areas, which include observability, release engineering, and service management.

Developing SRE Practitioners [00:06:00]
Discussion of the challenges of hiring experienced SRE practitioners and the suggestion of developing existing employees' skills and capabilities to become effective SREs. The speakers highlight the need for a framework to guide the development of SREs and drive value for the organization.

Establishing a Career Path for SREs [00:08:52]
The speakers discuss the need to establish a career path for SREs within an organization, including developing existing employees' skills and capabilities to become effective SREs and setting proper expectations for each level of SRE.

Collaborating with Other Departments and Teams [00:11:33]
The speakers provide ideas for how SREs can collaborate with other departments and teams, including establishing regular communication channels, forming cross-functional teams, and encouraging knowledge sharing as a community within the organization.

Reliability as an Organizational Conversation [00:13:20]
The speakers emphasize the importance of reliability as an organizational conversation, involving not just engineering but also other partners such as product, care, strategic, and marketing teams, to make products and services for customers reliable.

This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit read.srepath.com
Jun 15, 2023 • 17min

#5 Where does SRE fit into your organization's structure?

Throughout this episode, we discuss the different engagement models for Site Reliability Engineering (SRE) and how to fit SRE into an organization's structure. Sebastian Vietz, an experienced SRE practitioner, suggests five different engagement models for SRE and emphasizes the importance of considering the cost associated with each model. The hosts also discuss the different types of SREs that can exist within these engagement models, including SRE champions and unicorns. They stress the importance of considering organizational context when implementing SRE and tease a future episode where they will delve deeper into a framework for identifying the capabilities needed to solve SRE-related problems.

Timestamps of key concepts

Where and how SRE fits into an organization [00:00:20]
We discuss the importance of considering organizational context when implementing SRE and explore different engagement models for SRE.

Center of Excellence for Reliability Engineering [00:02:14]
We discuss the idea of a center of excellence for reliability engineering, where a few practitioners take on an advisory role for the organization.

Embedded SREs [00:04:14]
We discuss the idea of embedding SREs into teams, where each team has an embedded SRE whose focus is to implement reliability engineering principles and best practices.

Five SRE Engagement Models [00:08:23]
We discuss five different engagement models for SRE, including embedded SREs, a center of excellence, and a consulting or ambassador model.

Types of SREs [00:10:25]
We discuss different personas that an SRE can take, including champions, advocates, and unicorns.

Unicorn SREs [00:13:50]
We discuss the rare and sought-after unicorn SREs, who have extensive experience and exposure to different business domains and contexts.
Jun 1, 2023 • 19min

#4 Should organizations care about SRE?

This episode discusses how Site Reliability Engineering (SRE) can be important to organizations. SRE can optimize software operations, reduce costs, support revenue-driving areas, mitigate risks, improve cybersecurity, and enhance customer experiences. We will also cover how to integrate SRE into the organization's culture for continuous improvement and innovation.
May 17, 2023 • 23min

#3 SRE vs DevOps vs Platform Engineering

In this episode of SREpath, Ash and Sebastian discuss the unnecessary debate surrounding Site Reliability Engineering (SRE), DevOps, and platform engineering. They argue that these disciplines should not be pitted against each other, but rather seen as complementary and able to coexist within an organization. The focus should be on continuous improvement, learning from failures, and making things better. The hosts emphasize that practitioners in all three areas share the common goal of improvement and should collaborate rather than compete. They briefly distinguish SRE as focusing on system reliability and scalability, DevOps on collaboration and automation, and platform engineering on building and maintaining infrastructure. The decision to establish dedicated teams for each discipline depends on the organization's scale and needs. The hosts encourage a context-driven approach, where individuals from diverse backgrounds and skill sets can contribute to the SRE field. Ultimately, the key is to prioritize improvement and learning, regardless of labels or titles.
May 4, 2023 • 24min

#2 What is Site Reliability Engineering (SRE) and what is not SRE?

In this episode of the SREpath podcast, Ash and Sebastian explore what Site Reliability Engineering (SRE) is and how it manifests in a highly functional organization. We also cover the controversial issue of what SRE is not.
Apr 20, 2023 • 21min

#1 Introducing the SREpath podcast

Welcome to the first episode of the SREpath podcast! In this episode, we'll introduce you to our podcast hosts and give you their broad-level view of Site Reliability Engineering (SRE). We'll also share some points about how we'll be running future episodes. Whether you're an SRE expert or new to the field, this episode will provide valuable insights into SRE and what you can expect from our podcast series.
