

This is Fine! A podcast about resilience engineering and software
Colette Alexander and Clint Byrum
Ever wondered why things on the internet break? Do you work in software and wish that you could have a Dear-Abby-Like call-in show that could answer your deepest questions about how to make your workplace suck less? We're here to help!
Write us anonymously at our open question form
Email us at: thisisfine.softwarepodcast@gmail.com
Call us and leave a voicemail, or text us at: (401) 592-7574
Episodes

Feb 20, 2025 • 52min
Episode 10 - When They Go Full ITIL on You w/ special guest John Allspaw
You can find John at Adaptive Capacity Labs or his (old) blog at Kitchen Soap. ITIL is… well, it’s a thing. Colette’s “You’re surprised it works in the first place” comes from Richard Cook’s brilliant Velocity talk in 2013. FYI, John wasn’t talking about Franz Kafka; we think he was talking about Apache Kafka. But they are pretty similar, we think.

Feb 12, 2025 • 49min
Episode 9 - Learning from Incidents with special guest Alex Elman
You can find ACL (Adaptive Capacity Labs), the folks who train software engineers how to do LFI and who we speak so fondly of here. Colette mentioned Allspaw’s take on Five Whys - if you want to know why we think there are better options for learning out there, you can read it here. Alex did a great talk with Sarah Butt on some LFI related things at LFI Conf in 2023: https://www.youtube.com/watch?v=CbSiKAtO7Fk And at SRECon: SREcon20 Americas - Are We Getting Better Yet? Progress Toward Safer Operations. Colette went to go see whales in the Baja through this tour, it was awesome. Write to us at thisisfine.softwarepodcast@gmail.com or go fill out our form with a question at Thisisfinepod.com

Jan 29, 2025 • 37min
Episode 8 - Why Human Factors and Not Technical Ones
The spicy Allspaw take that inspired our listener is here: https://www.linkedin.com/posts/jallspaw_a-im-a-bit-salty-today-b-if-you-dont-activity-7287968197742411776-5_Ay Charles Perrow is the guy who wrote Normal Accidents (https://bookshop.org/p/books/normal-accidents-living-with-high-risk-technologies-updated-edition-revised-charles-perrow/10369279?ean=9780691004129&next=t&next=t), which Colette is somewhat controversially a fan of, and thus a Perrow-ian? (A lot of resilience engineering people are not fans!) Not many notes today, so here, go check out a page on one of Colette’s favorite chicken breeds: https://greenfirefarms.com/shetland_hen.html

Jan 22, 2025 • 54min
Episode 7 - AI and Resilience with special guest Courtney Nash
The VOID is one of our favorite things! Some of Courtney’s inoculation of the MTTR virus can be found here: an interview with InfoQ, a talk at SRE Con Americas in 2022, and Courtney’s recent talk on Automation and AI. David Graeber’s Bullshit Jobs started as a talk and then a great book. Want to read more about HABA-MABA and CSE/RE? Lisanne Bainbridge’s The Ironies of Automation is a perennial recommendation in our show notes. The thread Courtney mentioned from Gergely Orosz.

Jan 8, 2025 • 56min
Episode 6 - Can You Buy Resilience? With Special Guest Steve McGhee
Steve is the host of the Google SRE Prodcast, you should check it out! Colette got her chickens from Greenfire Farms, and her chicken coop from Carolina Coops, if anyone is wondering. The Chris Hayes podcast Colette mentioned about unconditional cash transfers is here. Iain M. Banks is the author of The Culture series, a set of fiction books based in a post-scarcity society. If you didn’t get the Vizzini/Inigo Montoya references, you should probably find a way to see The Princess Bride. Colette mentioned STAMP - which is more along the lines of reliability engineering than resilience engineering, technically, but is related. You can read about how Google is using it here. Lord, you want the history of ITIL? Okay.
**** note, none of the below sponsor us (yet), so these are pure-hearted endorsements from Clint during the episode ****
Adaptive Capacity Labs will teach your teams how to be more resilient. Incident.io is who Clint mentioned as one of the many incident automation tools out there (Rootly and FireHydrant are a couple others). Backstage is an open source Spotify product, and anyone who’s worked at Spotify will talk your ear off about how great it is if you let us.
*************************
A new Resilience Engineering community that Colette and Clint are a part of has launched! You can find us at resilienceinsoftware.org and join to be a part of the conversation in Slack. And of course, you can email us at thisisfine.softwarepodcast@gmail.com or write to us via http://thisisfinepod.com

Dec 22, 2024 • 0sec
Episode 5 - Curating Your Resilience Engineering 101
Dive into the intriguing world of resilience engineering, where insights from skiing mishaps lead to a discussion on complex system failures. Explore the evolution from Safety-I to Safety-II, emphasizing learning cultures and practical safety measures. The hosts critique resource challenges in the field, advocating for concise guides over vague narratives. They also tackle the pitfalls of unrealistic safety expectations, using real-world examples like the Exxon Valdez spill to highlight the gaps between planning and reality.

Dec 11, 2024 • 47min
Episode 4 - A Look at the 2024 DORA Report
Fred Hebert, Staff SRE at Honeycomb.io and technical author, brings a wealth of knowledge in resilience and distributed systems. He discusses the impact of the DORA report on software engineering and workplace culture. The conversation delves into the nuances of burnout, emphasizing self-care amidst rapid technological change. They analyze the complexities of AI adoption in the workplace, highlighting trust issues and leadership styles that can enhance employee well-being. Fred also critiques how corporate interests can skew data interpretation in assessing productivity metrics.

Dec 4, 2024 • 31min
Episode 3 - Lions, Tigers and Metrics, Oh My!
Vanessa Huerta Granda, a technology manager passionate about resilience engineering, shares her insights on navigating metrics in incident management. She discusses the challenges of code freezes and the importance of adaptable metrics. Vanessa emphasizes the significance of context when analyzing Mean Time to Recovery (MTTR) and how it can lead to meaningful insights. The conversation also highlights the necessity for better communication between tech teams and executives to ensure effective decision-making based on accurate data.

Nov 21, 2024 • 0sec
Episode 2 - Does Software Need Safety?
John Allspaw, a pioneer in resilience engineering known for his impactful work at Etsy and in the DevOps movement, dives into crucial discussions about safety in software. He addresses how traditional safety concepts clash with software development realities. The conversation highlights the necessity of psychological safety for innovation and explores narrative control's role in software perception. They also examine the dynamics of change management in development, including the risks associated with code freezes and fostering open communication during deployments.

Nov 7, 2024 • 35min
Episode 1 - Every Second Counts
In this engaging discussion, the hosts dive into the world of resilience engineering, sharing personal anecdotes that highlight its significance. They tackle the complexities of maintaining uptime and the common misconceptions surrounding it. The conversation turns to navigating job loss, where the importance of empathy and communication shines through during crises. There's also a call for community involvement, inviting listeners to join the conversation and share their experiences in fostering resilience in the workplace.