Liz Fong-Jones, co-author of the "Observability Engineering" book and Principal Developer Advocate for SRE and Observability at Honeycomb, discusses observability and its importance in the industry. She explains the core analysis loop, cardinality, and dimensionality, and the concept of debugging from first principles. She also talks about observability-driven development and the observability maturity model. Other topics include implementing observability, the challenges of understanding complex cloud-native microservices, and the importance of social alignment in observability.
Observability allows understanding of novel problems without needing new code and embraces the core analysis loop.
Implementing observability-driven development, shifting left for observability, and aligning people's understanding are key techniques for successful observability.
Deep dives
Understanding the Essence of Observability
Observability ensures that you can comprehend novel problems in your system without requiring new code. It allows you to quickly understand what is happening and why by analyzing existing data from telemetry signals. Observability enables you to debug from first principles, forming and testing hypotheses rapidly to pinpoint issues. It embraces the concept of the core analysis loop, where you compare normal and abnormal requests, correlate events, and visualize data to identify triggers. With high cardinality and dimensionality, observability offers deep introspection into the internal state of your system, distinguishing it from traditional monitoring. It is crucial in cloud-native, microservices environments and supports DevOps and SRE practices. Implementing observability-driven development and shifting left for observability are key techniques for success.
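As a rough illustration of one pass through the core analysis loop, the sketch below splits events into normal and abnormal groups and ranks which dimension values are over-represented among the abnormal ones. It is a minimal standard-library sketch, not how any particular tool implements it; the function name `rank_dimensions`, the `ignore` parameter, and field names such as `build_id` and `duration_ms` are assumptions made for illustration.

```python
from collections import Counter


def rank_dimensions(events, is_abnormal, ignore=()):
    """Rank (dimension, value) pairs by how much more often they appear
    in abnormal events than in normal ones.

    Hypothetical helper for illustration only.
    """
    abnormal = [e for e in events if is_abnormal(e)]
    normal = [e for e in events if not is_abnormal(e)]
    if not abnormal or not normal:
        return []  # nothing to compare against

    # Every field seen on any event is a candidate dimension.
    dimensions = {key for e in events for key in e} - set(ignore)

    findings = []
    for dim in dimensions:
        abnormal_counts = Counter(e.get(dim) for e in abnormal)
        normal_counts = Counter(e.get(dim) for e in normal)
        for value, count in abnormal_counts.items():
            abnormal_share = count / len(abnormal)
            normal_share = normal_counts.get(value, 0) / len(normal)
            findings.append((abnormal_share - normal_share, dim, value))

    # Largest difference first: the most likely place to look next.
    findings.sort(key=lambda finding: finding[0], reverse=True)
    return findings


# Made-up sample events; slow requests are treated as abnormal.
sample_events = [
    {"endpoint": "/cart", "build_id": "b41", "region": "us-east", "duration_ms": 80},
    {"endpoint": "/cart", "build_id": "b42", "region": "us-east", "duration_ms": 1900},
    {"endpoint": "/home", "build_id": "b41", "region": "eu-west", "duration_ms": 60},
    {"endpoint": "/home", "build_id": "b42", "region": "eu-west", "duration_ms": 2100},
    {"endpoint": "/cart", "build_id": "b41", "region": "eu-west", "duration_ms": 95},
    {"endpoint": "/home", "build_id": "b41", "region": "us-east", "duration_ms": 70},
]

top = rank_dimensions(
    sample_events,
    is_abnormal=lambda e: e["duration_ms"] > 1000,
    ignore=("duration_ms",),  # the field that defines "abnormal" is not a clue
)[0]
print(top)
```

In this made-up data the top-ranked pair points at a single build, which would become the next hypothesis to test. The more dimensions each event carries, the more candidate explanations a pass like this can surface.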
Implementing Observability in Your Systems
To implement observability, start by adopting the OpenTelemetry SDK, which provides a common language for generating and transmitting telemetry data. Add instrumentation to your code, combining auto-instrumentation with the attributes developers consciously choose to record. Choose a vendor solution or open-source tool that supports OpenTelemetry and meets your evaluation criteria. Continuously improve your software delivery cycle, ensuring fast and frequent deployments. Focus on developing and running systems, driving collaboration and ownership among your team. Prioritize user analytics alongside operations and resilience workflows. Manage technical debt to maintain system health. Embrace observability as a socio-technical capability that aligns people and tools for successful software development and operations.
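To make "conscious" instrumentation concrete, here is a small standard-library sketch of a developer attaching business context to a single event per unit of work. It deliberately does not use the OpenTelemetry API; the helper `wide_event`, the `sink` callback, and the field names are hypothetical stand-ins for whatever span or event mechanism your chosen SDK provides.

```python
import json
import time
from contextlib import contextmanager


@contextmanager
def wide_event(name, sink, **fields):
    """Collect fields for one unit of work and emit them as a single JSON record.

    Hypothetical helper for illustration; a real setup would normally record
    these fields as attributes on a span from the telemetry SDK in use.
    """
    event = {"name": name, **fields}
    start = time.monotonic()
    try:
        yield event  # the caller adds fields as it learns them
    except Exception as exc:
        event["error"] = type(exc).__name__
        raise
    finally:
        event["duration_ms"] = round((time.monotonic() - start) * 1000, 3)
        sink(json.dumps(event))


# Usage: the developer decides which context is worth recording.
emitted = []
with wide_event("checkout", emitted.append, user_id="u-123") as event:
    event["cart_items"] = 3
    event["payment_provider"] = "example-pay"

print(emitted[0])
```

Fields such as a user ID are high-cardinality, and each extra field adds a dimension; both make later slicing of the data more useful.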
Evaluating Maturity and Continuous Improvement
Assessing observability maturity involves evaluating five critical areas: observability-driven development, continuous delivery, resilience workflows, user analytics, and managing technical debt. Mature observability practices foster observability-driven development, which helps code work right the first time and exercises instrumentation during testing. Continuous delivery enables rapid and frequent deployments, reducing barriers to change and ensuring observability is embedded in the process. Building resilience workflows helps teams actively understand and improve system performance, addressing issues within half an hour. User analytics provide insights into user experiences and guide product development. Lastly, managing technical debt is essential to keep systems adaptable and maintainable. Continuously assess your observability practices and progress against the observability maturity model to drive improvement.
The Social Importance of Observability
While tools and technologies are integral to observability, focusing on the social aspect is paramount. Effective observability requires aligning people's understanding, goals, and outcomes. Prioritize communication, collaboration, and shared working models to ensure everyone is on the same page. Don't just introduce new tools, but instead, align people on desired outcomes and work methodologies. When people are aligned, tools become more effective, and observability practices thrive. Remember that software delivery is primarily about people, not just technology. Emphasize social cohesion, empathy, and shared responsibility to achieve successful observability and drive positive impact in your organization.
“Observability is a technique for ensuring that you can understand novel problems in your system. Can you understand what’s happening in your system and why, without having to push a new code by slicing and dicing existing telemetry signals that are coming out of your system?”
Liz Fong-Jones is the co-author of the “Observability Engineering” book and a Principal Developer Advocate for SRE and Observability at Honeycomb. In this episode, Liz discussed observability in depth and why it is becoming an important practice in the industry. Liz started by explaining the fundamentals of observability and how it differs from traditional monitoring. She explained some important concepts, such as the core analysis loop, cardinality and dimensionality, and debugging from first principles. Later, Liz shared the current state of observability and how we can improve it by practicing observability-driven development and by improving our practices based on the observability maturity model proposed in the book.
Listen out for:
Career Journey - [00:05:44]
Observability - [00:06:30]
Pillars of Observability - [00:09:57]
Monitoring and SLO - [00:12:28]
Core Analysis Loop - [00:15:06]
Cardinality and Dimensionality - [00:18:41]
Debugging from First Principle - [00:21:20]
Current State of Observability - [00:26:49]
Implementing Observability - [00:30:20]
Observability Driven Development - [00:36:53]
Having Developers On-Call - [00:39:06]
Observability Maturity Model - [00:41:59]
3 Tech Lead Wisdom - [00:44:10]
_____
Liz Fong-Jones’s Bio
Liz is a developer advocate, labor and ethics organizer, and Site Reliability Engineer (SRE) with 17+ years of experience. She is an advocate at Honeycomb for the SRE and Observability communities, and previously was an SRE working on products ranging from the Google Cloud Load Balancer to Google Flights. She lives in Vancouver, BC with her wife Elly, partners, and a Samoyed/Golden Retriever mix, and in Sydney, NSW. She plays classical piano, leads an EVE Online alliance, and advocates for transgender rights.
For more info about the episode (including quotes and transcript), visit techleadjournal.dev/episodes/88.