Charity Majors, Co-founder of Honeycomb, talks about observability engineering, the evolution of telemetry, benefits of high cardinality in observability, effectively communicating the value of engineering work, and challenges in instrumentation and spans.
Observability engineering goes beyond traditional metrics and focuses on wide events and spans to provide context and explore the internal state for effective debugging and data-driven decision-making.
Implementing observability may come with challenges such as high storage costs and complexity; however, starting with pain points, using smart sampling techniques, and leveraging available tools can help overcome these obstacles.
Proper instrumentation and tracing are crucial for observability, capturing data at various points in the system to understand flow and behavior, with intelligent sampling and following conventions ensuring accurate and meaningful data.
Deep dives
Observability Engineering: Understanding Your Systems
Observability engineering is about understanding what's going on inside your systems from their outputs. It goes beyond traditional metrics and focuses on wide events and spans, which provide context and allow you to explore the internal state. This approach helps you understand the behavior of your software, debug issues more effectively, and make data-driven decisions. Implementation can start with the most painful parts of your system, such as frequent outages or performance bottlenecks, with intelligent sampling to reduce the volume of data collected and prioritize what matters most. Underlying all of this is an understanding of the value and impact of your software on your customers, which forms the foundation of observability.
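As a rough illustration of a wide event, the sketch below builds one record per request and lets the handler attach high-cardinality context (user ID, endpoint, and so on) as it runs. The names handle_request and emit, and every field name, are hypothetical and chosen for illustration only; this is not Honeycomb's API or anything described in the episode.

```python
import json
import time
import uuid


def handle_request(user_id, endpoint, work):
    """Run `work(event)` and return one wide event describing the request.

    All names here are hypothetical; this is a sketch, not a vendor API.
    """
    event = {
        "request_id": str(uuid.uuid4()),  # high-cardinality: unique per request
        "user_id": user_id,               # high-cardinality: one value per user
        "endpoint": endpoint,
    }
    start = time.perf_counter()
    try:
        work(event)  # the handler may add any context it knows about
        event["status"] = "ok"
    except Exception as exc:
        # The sketch records the failure instead of re-raising it,
        # so the event can still be returned and emitted.
        event["status"] = "error"
        event["error_type"] = type(exc).__name__
    event["duration_ms"] = (time.perf_counter() - start) * 1000
    return event


def emit(event):
    """Serialize the event as one JSON line, ready to ship to a backend."""
    return json.dumps(event, sort_keys=True)
```

A handler might call handle_request("u1", "/cart", lambda ev: ev.update(cart_items=3)) and pass the result to emit. Because every field lives on the same record, any of them can later be used to slice the data.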
Overcoming Challenges of Observability
Implementing observability may come with challenges, such as the potentially high cost of storing large amounts of data. Smart sampling addresses this by keeping important events and discarding less relevant ones. The complexity of setting up observability can also be daunting, but focusing on pain points, starting with one important area such as the CI/CD pipeline, and leveraging available tools like OpenTelemetry can help. It is crucial to observe meaningful data and to align observability efforts with the goals and values of the organization.
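One possible reading of "smart sampling" is sketched below: keep every error and every slow request, keep roughly one in base_rate of the rest, and record the rate with each kept event so that counts can be re-weighted later. The function name, the thresholds, and the field names are assumptions made for this example, not a technique attributed to the episode or to any specific tool.

```python
import hashlib


def sample_decision(event, base_rate=10, slow_ms=1000.0):
    """Return the sample rate to store with the event, or None to drop it.

    Hypothetical sketch: field names and thresholds are illustrative.
    """
    # Always keep the interesting events: errors and slow requests.
    if event.get("status") == "error" or event.get("duration_ms", 0) >= slow_ms:
        return 1
    # For routine traffic, hash the request ID so the decision is
    # deterministic: the same request always gets the same answer.
    digest = hashlib.sha256(event["request_id"].encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % base_rate
    # A kept event stands in for `base_rate` similar events.
    return base_rate if bucket == 0 else None
```

Storing the returned rate on each kept event means a query can multiply by it to estimate the original traffic volume, which is what keeps sampled data meaningful.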
Instrumentation and Tracing in Observability
Proper instrumentation and tracing are key elements of observability. They involve capturing data at various points in the system and creating traceable events or spans to understand the flow and behavior of requests. Instrumentation allows you to observe what's happening inside your code, while tracing provides a detailed view of the entire request lifecycle across different services. The challenge lies in capturing enough events and spans without overwhelming the system. Adopting intelligent sampling techniques and following conventions helps ensure accurate and meaningful observability data.
Rolling Out Observability in Your Organization
Introducing observability in your organization can be a gradual process. Starting with a pain point or an area of improvement can help demonstrate the value of observability to the team. This can involve targeting issues like frequent outages, performance bottlenecks, or reliability challenges. Prioritizing instrumentation and establishing an instrument-first mentality during incident response can help drive the adoption of observability practices. It is essential to communicate the benefits of observability and involve stakeholders early on to ensure a smooth rollout and foster a culture that embraces observability as a valuable tool.
The Future of Observability
The future of observability lies in the collaboration between humans and machines. While advancements in AI and machine learning can assist in data analysis, decision-making, and anomaly detection, the interpretation and understanding of observability data remain within the realm of human expertise. Rather than relying solely on machines to provide insights, it is important to continue to focus on the human aspect of observability, connecting the dots, and making meaningful decisions based on the data. As observability evolves, emphasis should be placed on value, purpose, and customer impact, driving engineers to better understand and improve the systems they build.
What is observability engineering, and why do you need some? While at NDC in Porto, Carl and Richard recorded a .NET Rocks Live with Charity Majors, one of the founders of Honeycomb. Charity talked about her experiences trying to understand how complex applications worked and failed at scale during her years at Facebook and other companies. Ultimately, those experiences led to a book and the creation of Honeycomb. Lots of fun insight from someone who has fought the good fight, and some great questions from the audience!