This episode explores how the team moved to automated deploys: how automation supports deploy safety work and favors rollbacks over hotfixes for resolving issues quickly. It also covers the challenge of giving developers the deploy information they need without overwhelming them.
This week we’re joined by Sean McIlroy from Slack’s Release Engineering team to learn how they’ve fully automated their deployment process. The conversation covers Slack’s original release process, the key changes Sean’s team has made, and the challenges they’re working on today.
Timestamps:
- (1:34): The Release Engineering team
- (2:13): How the monolith has served Slack
- (3:24): How the deployment process used to work
- (6:23): The complexity of the deploy itself
- (7:39): Early ideas for improving the deployment process
- (9:07): Why anomaly detection is challenging
- (10:32): What a Z-score is
- (13:23): Managing noise with Z-scores
- (16:49): Presenting this information to people that need it
- (19:54): Taking humans out of the process
- (23:13): Handling rollbacks
- (25:27): Not overloading developers with information
- (28:26): Handling large deployments
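For listeners unfamiliar with the Z-score discussed around the 10:32 mark, here is a minimal sketch of the general statistical idea: how many standard deviations a new value sits from the mean of a baseline window. This is not Slack's implementation. The function names, the sample data, and the threshold of 3 are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def z_score(value, baseline):
    """Return how many standard deviations `value` is from the baseline mean."""
    mu = mean(baseline)
    sigma = pstdev(baseline)  # population standard deviation of the window
    if sigma == 0:
        # A perfectly flat baseline has no spread; treat the score as 0
        # rather than dividing by zero (an illustrative choice).
        return 0.0
    return (value - mu) / sigma


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value whose absolute z-score meets a hypothetical threshold."""
    return abs(z_score(value, baseline)) >= threshold


# Hypothetical error counts per minute before a deploy.
baseline = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_score(9, baseline))
print(is_anomalous(11, baseline))
```

A higher threshold flags fewer values and so produces less noise, at the cost of missing smaller regressions.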