Day Two DevOps

D2DO252: (Re)Building Cloudflare’s Millions-of-Logs-Per-Second Logging Pipeline

Oct 2, 2024
Colin Douch, Observability Tech Lead at Cloudflare, and Jayson Cena, SRE at Cloudflare, dive into the complexities of migrating from syslog-ng to OpenTelemetry. They discuss the motivations for this shift, such as scalability and memory safety, and challenges such as keeping customer traffic uninterrupted during the migration. They also highlight the importance of redundancy in logging systems and share insights on logging protocols, illustrating the balance between resource usage and operational speed in a high-performance environment.
37:53

Podcast summary created with Snipd AI

Quick takeaways

  • Cloudflare's migration to OpenTelemetry significantly enhances its logging capabilities by improving scalability, performance, and maintainability for handling millions of logs per second.
  • The successful deployment of OpenTelemetry required meticulous planning to ensure uninterrupted customer traffic, showcasing the importance of operational efficiency during major transitions.

Deep dives

Importance of Managing Unmanaged Devices and Apps

Companies face significant challenges in securing data when employees use unmanaged devices and non-approved applications. Traditional identity and access management (IAM) and mobile device management (MDM) solutions often fall short in addressing these security gaps. It is essential to implement strategies that extend beyond conventional methods to safeguard sensitive information effectively. Organizations need to prioritize solutions that cater to the complexities of modern work environments.
