The hosts discuss debugging techniques, with Bill preferring production tools over debuggers. Matt provides counterpoints. They explore coding style guidelines, challenges with PagerDuty, importance of logging, and evolution of developer tools. The podcast also touches on human behavior in relation to app icon color and organization, as well as app installations and travel perspectives.
Developers should prioritize adding logs over debugger usage to fix underlying issues.
Different debugging tools and approaches are needed for local vs. production environments.
Logging acts as an insurance policy for tracking down bugs, but it must be balanced so that logs stay actionable.
Metrics play a crucial role in monitoring system behavior, provided they keep a clear signal-to-noise ratio.
Deep dives
Philosophy on Debugging
The hosts discuss their differing philosophies on debuggers. Matt leans towards using a debugger to build a mental model of an unfamiliar code base, while Bill applies a 20-minute rule before allowing debugger use. Bill advocates adding logs before reaching for the debugger and encourages fixing the underlying issue rather than relying solely on debugging tools.
Local vs. Production Debugging
The conversation transitions to discussing the differences between debugging locally and in production environments. Matt and Bill agree that local debugging tools and approaches differ from those used in production. Bill's rule emphasizes that if a tool cannot be used in production to identify a bug, it should not be used during local development. Matt highlights the importance of process and discipline in going back to fix issues found during debugging to prevent future occurrences.
The Role of Logging
Bill and Matt delve into the significance of logging in the debugging process. Drawing on his experience, Bill stresses that logging is an insurance policy for tracking down bugs. Matt explains how logging affects the signal-to-noise ratio and how traffic volume determines how long logs can practically be retained. Both caution against excessive logging and recommend keeping a balance so that logs provide actionable insights.
Metrics for Monitoring
The discussion shifts to the value of metrics for monitoring and for detecting patterns in system behavior. Matt advocates using metrics to identify patterns such as transaction failures, broken down at various levels of cardinality. Bill raises concerns about metric dashboards that favor flashiness over usefulness, and stresses that metrics need a clear signal-to-noise ratio to provide valuable insights.
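To illustrate what cardinality means for a failure metric, the Go sketch below keeps an in-memory counter keyed by a small label set. It is a stand-in for a real metrics library, not the setup the hosts use, and the labels type, the Provider and Reason fields, and the sample values are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// labels is a hypothetical low-cardinality label set: each field has only
// a handful of possible values, so the number of series stays small.
type labels struct {
	Provider string
	Reason   string
}

// failureCounter counts transaction failures per label combination.
type failureCounter struct {
	mu     sync.Mutex
	counts map[labels]int
}

func newFailureCounter() *failureCounter {
	return &failureCounter{counts: make(map[labels]int)}
}

// Inc records one failure for the given label combination.
func (c *failureCounter) Inc(l labels) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[l]++
}

// Count returns the failures recorded for one label combination.
func (c *failureCounter) Count(l labels) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[l]
}

// Series returns the number of distinct label combinations seen so far,
// which is the cardinality of this metric.
func (c *failureCounter) Series() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.counts)
}

func main() {
	c := newFailureCounter()
	c.Inc(labels{Provider: "provider-a", Reason: "timeout"})
	c.Inc(labels{Provider: "provider-a", Reason: "timeout"})
	c.Inc(labels{Provider: "provider-b", Reason: "declined"})
	fmt.Println("series:", c.Series())
	fmt.Println("provider-a timeouts:", c.Count(labels{Provider: "provider-a", Reason: "timeout"}))
}
```

Adding a field such as a per-transaction ID to the label set would create one series per transaction, which is the kind of high cardinality that tends to drown out the pattern a metric is meant to reveal.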
Resource Management and Actionable Insights
Matt shares instances where CPU graphs signaled the need for immediate action in resource management. He discusses incidents where spikes in CPU usage led to performance degradation, necessitating decisions to scale systems or prioritize job queues. Matt's examples illustrate the practical application of metrics and CPU monitoring to address system contingencies and ensure optimal performance.
Efficient Web Scraping and Information Dissemination
The episode discusses the inefficiency of web scraping bots constantly searching for changes in website content for search engine rating. Instead, the solution involved pushing real-time information updates to the bots. This approach aimed to improve efficiency by sending updates directly, reducing unnecessary scraping activities.
Optimizing Workload and Resource Usage for Search Engine Updates
The discussion delves into the challenges of managing workload spikes when sending updates to search engines. The process involved using Redis to store information temporarily due to rate limits. To address fluctuating CPU demands, the focus shifted to tuning workloads efficiently without relying solely on horizontal pod autoscaling, emphasizing the balance between resource availability and workload demands.
In this episode Matt, Bill & Jon discuss various debugging techniques for use in both production and development. Bill explains why he doesn’t like his developers to use the debugger and how he prefers to only use techniques available in production. Matt expresses a few counterpoints based on his different experiences, and then the group goes over some techniques for debugging in production.