

#398: [What's New with AWS Well-Architected #2] Operational Excellence
Oct 13, 2020
Brian Carlson, the Global Operational Excellence Lead for the AWS Well-Architected program, shares insights on improving operational practices in cloud architecture. He discusses the importance of learning from mistakes, including a personal story about a significant outage caused by inadequate load testing. The conversation highlights how collaboration and a no-blame culture set teams up for success. Carlson also explains the evolution of the Well-Architected Framework and its role in fostering continuous improvement in cloud operations.
AI Snips
Outage Story
- Brian Carlson caused an outage affecting one-seventh of the US by implementing guaranteed syslog without proper load testing.
- A race condition occurred, preventing new sessions, but thankfully, the change was easily rolled back.
Well-Architected Framework
- The Well-Architected Framework guides customers in building effective cloud architectures by understanding operational risks.
- It helps those whose strength lies in application creation rather than operations.
Focus on Business Outcomes
- In operational excellence, prioritize business and organizational outcomes.
- Ensure that business, development, and operations teams collaborate effectively.