Engineering Enablement by DX

DX
Mar 29, 2023 • 45min

Bringing the product management discipline to platform teams | Russ Nealis (Plaid)

As product lead, Russ Nealis has been focused on introducing the discipline of product management in the Developer Foundations organization. This episode discusses the reasons why PMs are currently uncommon in platform organizations, examples of when having a PM has been helpful, and more.

Discussion points:
(1:23) Russ’s role at Plaid
(2:49) Why platform product managers are uncommon
(3:28) Backgrounds to look for when hiring a platform PM
(4:58) Deciding whether to hire a platform PM
(6:20) Signs that bringing in a product manager would be beneficial
(9:16) How Russ personally became a platform PM
(12:15) Whether a platform PM is a career path
(14:55) Articulating the business impact a platform PM has
(18:56) Challenges Plaid’s platform team has faced without a PM
(19:19) Symptoms of a need for product management in an internal-facing team
(30:15) Whether Twilio had platform PMs
(31:22) Example projects where PMs have been crucial
(34:12) How the book “Ask Your Developer” influenced Twilio’s engineering culture
(36:13) Getting started with introducing a product management discipline to an organization
(38:33) Org structure and where platform PMs may report
(40:00) Career ladder for platform PMs when reporting to engineering leadership
(41:20) Being product-led or technology-led
(43:14) How technical skills may help when in a platform PM role

Mentions and links:
Follow Russ on LinkedIn
Episode 7 with Will Larson - related to why it’s difficult to find platform PMs
Episode 27 with Jean-Michel Lemieux - related to the percentage of investment that should go towards platform work
Escaping the Build Trap by Melissa Perri
Ask Your Developer by Jeff Lawson
Mar 8, 2023 • 1h 10min

Intercom’s approach to a great on-call experience | Brian Scanlan (Intercom)

In this deep-dive episode, Brian Scanlan, Principal Systems Engineer at Intercom, describes how the company’s on-call process works. He explains how the process started and the key changes the team has made over the years, including a new volunteer model, changes to compensation, and more.

Discussion points:
(1:28) How on-call started at Intercom
(10:11) Brian’s background and interest in being on-call
(14:06) Getting engineers motivated to be on-call
(16:37) Challenges Intercom saw with on-call as it grew
(19:53) Having too many people on-call
(23:20) Having alarms that aren’t useful
(26:03) Recognizing uneven workload with compensation
(27:22) Initiating changes to the on-call process
(30:08) Creating a volunteer model
(33:02) Addressing concerns that volunteers wouldn’t take action on alarms
(34:40) Equitability in a volunteer model
(36:36) Expectations of expertise for being on-call
(40:56) How volunteers sign up
(44:15) The Incident Commander role
(46:19) Using code review for changes to alarms
(50:02) On-call compensation
(52:50) Other approaches to compensating on-call
(55:08) Whether other companies should compensate on-call
(57:32) How Intercom’s on-call process compares to other companies
(1:00:46) Recent changes to the on-call process
(1:04:13) Balancing responsiveness and burnout
(1:07:12) Signals for evaluating the on-call process

Mentions and links:
Follow Brian on LinkedIn or Twitter
Brian’s article: How we fixed our on call process to avoid engineer burnout
Gergely Orosz’s On-Call Compensation
Feb 16, 2023 • 56min

How Instagram Reels manages reliability | Jack Li (Instagram, Shopify)

Jack Li explains how his production engineering team rolled out a new incident review process, how they’ve made the case for investing in reliability, and specific tools his team has built to improve reliability.

Discussion points:
(1:25) How Jack became interested in reliability
(3:24) Where the Instagram Reels team fits into the broader organization
(4:05) What Jack’s team focuses on
(4:55) The role of production engineering at Instagram versus Shopify
(8:32) The essence of DevOps
(10:44) Pros and cons of having product-focused teams
(13:35) How Jack’s team defines and tracks quality
(15:46) Signals the team monitors outside of systems
(18:10) Revamping Instagram Reels’ incident management process
(19:46) Making the case for improving the incident review process
(28:10) How their incident review process works
(31:55) The roles involved in an incident review
(33:40) The value of having incident reviews
(35:55) Why leaders should be part of incident reviews
(38:34) Why Jack’s team builds tools for driving reliability goals
(40:06) The types of tools Jack’s team focuses on
(43:09) What a merge queue is and why it was built at Shopify
(51:20) Using a Slack bot for ‘failed build’ alerts
(52:32) When a company should consider implementing a merge queue

Mentions and links:
Follow Jack on LinkedIn
Jack’s article from his time at Shopify about their Merge Queue
Jack’s talk on Shopify’s Merge Queue at GitHub Universe 2019
Jan 25, 2023 • 55min

A masterclass on DORA – research program, common pitfalls, and future direction | Nathen Harvey (Google)

Nathen Harvey, who leads DORA at Google, explains what DORA is, how it has evolved in recent years, the common challenges companies face as they adopt DORA metrics, and where the program may be heading in the future.

Discussion points:
(1:48) What DORA is today and how it exists within Google
(3:37) The vision for Google and DORA coming together
(5:20) How the DORA research program works
(7:53) Who participates in the DORA survey
(9:28) How the industry benchmarks are identified
(11:05) How the reports have evolved over recent years
(13:55) How reliability is measured
(15:19) Why the 2022 report didn’t have an Elite category
(17:11) The new Slowing, Flowing, and Retiring clusters
(19:25) How to think about applying the benchmarks
(20:45) Challenges with how DORA metrics are used
(24:02) Why comparing teams’ DORA metrics is an antipattern
(26:18) Why ‘industry’ doesn’t matter when comparing organizations to benchmarks
(29:32) Moving beyond DORA metrics to optimize organizational performance
(30:56) Defining different DORA metrics
(36:27) Measuring deployment frequency at the team level, not the organizational level
(38:29) The capabilities: there’s more to DORA than the four metrics
(43:09) How DORA and SPACE are related
(47:58) DORA’s capabilities assessment tool
(49:26) Where DORA is heading

Mentions and links:
Follow Nathen on LinkedIn or Twitter
Engineering Enablement episode with Dr. Nicole Forsgren
2022 State of DevOps report
Bryan Finster’s How to Use & Abuse DORA Metrics (and Abi’s summary of the paper)
Engineering Enablement episode with Dr. Margaret-Anne Storey
Join the DORA community for discussion and events: dora.community
Jan 18, 2023 • 38min

An inside look at the SPACE framework | Dr. Margaret-Anne Storey (co-author, SPACE)

This week's guest is Dr. Margaret-Anne Storey, who goes by the name Peggy. Peggy is a professor of Computer Science at the University of Victoria, the Chief Scientist at DX, and co-author of the SPACE Framework, which is the focus of this episode. Today’s conversation discusses what the SPACE framework is and what went into developing the metrics and categories. Peggy also shares where she sees this line of research heading next.

Discussion points:
(1:29) Peggy’s background
(4:01) What the SPACE framework is
(5:55) Why the researchers came together for this paper
(7:27) The process of writing this paper
(9:52) How the SPACE categories and acronym emerged
(11:50) The authors’ intention for how this framework would be received
(13:26) Finding a definition for what developer productivity is
(17:08) The metrics included in the SPACE framework
(24:48) How SPACE is different from DORA
(26:17) Why lines of code and number of pull requests were included as example metrics
(27:14) What Peggy is thinking about next

Mentions and links:
Where to find Peggy: Twitter, Website
The SPACE of Developer Productivity: There’s more to it than you think by Nicole Forsgren, Margaret-Anne Storey, Chandra Maddila, Thomas Zimmermann, Brian Houck, and Jenna Butler
Abi’s summary of the SPACE paper
Peggy’s talk, What Does Productivity Actually Mean for Developers?
Jan 11, 2023 • 44min

Spotify’s failed #SquadGoals | Jeremiah Lee (Spotify, Stripe)

This week’s guest is Jeremiah Lee, who was previously a manager at Stripe and product manager at Spotify. This conversation focuses on org structure, and specifically Jeremiah’s experience with the popular squad model from Spotify. Jeremiah provides the backstory on where the model came from, what parts of the model were a challenge, and advice for leaders either already adopting the model or considering doing so.

Discussion points:
(1:40) What the Spotify model is
(4:39) Jeremiah’s impression of the Spotify model as he joined the company
(7:29) Spotify’s progress in adopting the model as Jeremiah joined
(9:55) Challenges with matrix management
(12:02) The role of engineering managers
(14:40) What the model was designed to solve
(15:54) Good autonomy versus toxic autonomy
(18:51) How Agile coaches were used at Spotify
(21:39) Advice for teams who are struggling to implement the Spotify model
(24:50) Advice for leaders who are starting to think about org design
(27:30) How Stripe approached org structure
(30:26) How org structure affects a platform team’s work
(33:32) Tracking engineering org structures
(36:02) Why the squad model became so popular
(39:37) What the original authors may have felt about the popularity of the model

Mentions and links:
Follow Jeremiah on LinkedIn
Jeremiah’s article: Spotify’s Failed #SquadGoals
The original whitepaper on the Spotify model: Scaling Agile at Spotify
Team Topologies by Matthew Skelton and Manuel Pais
Essential Scrum by Kenneth S. Rubin
Jan 4, 2023 • 53min

How much to invest in platform work | Jean-Michel Lemieux (Shopify, Atlassian)

Jean-Michel Lemieux, former CTO of Shopify and VP of Engineering at Atlassian, explains how to advocate for investing in platform work, which projects to fund, and what distinguishes a great platform leader.

Discussion points:
(1:38) Jean-Michel’s definition of platform work
(6:44) Why reliability, performance, and stability do fall within platform work
(7:24) The consequences of lacking a product mindset in platform
(9:20) Why and how to advocate for investing 50% of R&D spend in platform work
(12:31) How Jean-Michel arrived at 50% as the percentage of R&D spend that should be allocated to platform
(16:09) Jean-Michel’s experiences with different levels of investment in platform work
(21:59) What percentage of platform investment should go towards keep-the-lights-on work
(24:01) Whether the allocation changes at different company stages
(27:05) Why platform work is consistently underinvested in
(29:00) Why having a platform team could be an anti-pattern
(32:32) How to advocate for this work to leaders
(35:35) What it looks like to over-invest in platform work
(40:03) How to decide which initiatives to invest in
(47:41) Making build vs buy decisions in platform work
(49:58) What distinguishes a great platform leader

Mentions and links:
Follow Jean-Michel Lemieux on LinkedIn and Twitter
Abi’s post that sourced many of the questions discussed in this conversation
Jean-Michel’s book chapter on platform investments
Jean-Michel’s definition of what platform work is
The podcast episode on what Shopify expects of managers
Dec 20, 2022 • 50min

Principles for driving adoption and platform team growth | Jonathan Biddle (Wayfair)

Jonathan Biddle, Director of Engineering Effectiveness at Wayfair, shares the story of how his team found repeat success and subsequently grew in size and scope. He shares lessons they’ve borrowed from startups, including understanding the adoption curve and knowing your core users, and offers advice for other platform teams looking to move to the next stage.

Discussion points:
(01:15) How Jonathan moved into his role
(05:30) Why platform teams are in a position of leverage, but also ambiguity
(07:18) The initial work Jonathan’s team focused on
(10:07) Creating transactional versus recurring value
(11:36) The difference between startups and platform teams
(14:12) Expanding the team’s scope and rebranding to Developer Acceleration
(18:20) What drove the platform team’s success
(21:05) Three adoption concepts to understand
(24:41) Knowing your core customers
(27:36) Adoption metrics and feedback gathering mechanisms
(33:37) When to mandate adoption or rely on organic adoption
(38:38) A story of when adoption fell short
(45:35) Advice for how other teams can go from zero to one

Mentions and links:
Follow Jonathan on LinkedIn
Diffusion of Innovations by Everett M. Rogers (and the Wikipedia page for the book)
Crossing the Chasm by Geoffrey A. Moore
Let My People Go Surfing by Yvon Chouinard of Patagonia
Dec 13, 2022 • 50min

Leading infrastructure change at scale | Ian White (DAT)

Ian White, Director of Platform Engineering at DAT, joined the company to scale their Kubernetes-based cloud infrastructure, which has come under stress as their business has grown over the past couple of years. Here he shares how he partnered with developers to learn about their challenges, how he conveyed a vision for how the company needed to evolve, and how he’s been working with development teams and business stakeholders to successfully drive change.

(01:00) - The challenges DAT was facing as Ian joined
(05:13) - How Ian used customer interviews to understand problems
(10:48) - The typical journey companies take to scale their infrastructure as they grow
(16:20) - How early changes were positioned and received
(20:00) - The four personas Ian identified
(25:14) - How Ian evangelized the vision
(28:48) - Areas of pushback Ian foresees as they introduce new changes
(33:00) - Handling teams that want to stay on self-managed infrastructure instead of moving to managed infrastructure
(41:55) - Managing business stakeholders
(45:00) - Partnering with finance

Where to find Ian:
Follow Ian on LinkedIn
Dec 7, 2022 • 34min

Positioning platform work in a down market | Brian Guthrie (Orgspace, Meetup)

Brian Guthrie, co-founder and CTO at Orgspace and former VP of Engineering at Meetup, has the unique experience of having previously decommissioned his Platform team. In this episode, Brian talks about that story openly, and shares advice for Platform teams to make sure they’re well positioned within their organizations.

Discussion points:
Brian’s background and story at Meetup - [00:02:20]
Brian’s perspective on Platform work, generally - [00:06:40]
The conversation around dissolving the Platform group - [00:12:05]
Advice for Platform groups positioning their teams - [00:16:55]
Making sure Platform groups are focused on the right problems - [00:21:21]
How Platform groups can think about communicating with the business - [00:23:50]
Bringing engineering teams into the planning process - [00:25:43]
Deciding to build vs buy in a down market - [00:28:40]
How developer happiness is part of positioning platform work - [00:32:30]

Follow Brian:
Brian's LinkedIn: https://www.linkedin.com/in/bguthrie/

Mentions and links:
Brian's talk, Is the optimal size of a platform team... zero?
The Future of Ops is Platform Engineering by Charity Majors
Former Shopify CTO's take on the optimal spend on platform work
Research on how developer happiness impacts productivity
