
Day Two DevOps
D2DO252: (Re)Building Cloudflare’s Millions-of-Logs-Per-Second Logging Pipeline
Oct 2, 2024
Colin Douch, Observability Tech Lead at Cloudflare, and Jayson Cena, SRE at Cloudflare, dive into the complexities of migrating from syslog-ng to OpenTelemetry. They discuss the motivations for this shift, such as scalability and memory safety, while tackling challenges like maintaining uninterrupted customer traffic. The duo also highlights the importance of redundancy in logging systems and shares insights on logging protocols, illustrating the balance between resource usage and operational speed in a high-performance environment.
37:53
Quick takeaways
- Cloudflare's migration to OpenTelemetry significantly enhances its logging capabilities by improving scalability, performance, and maintainability for handling millions of logs per second.
- The successful deployment of OpenTelemetry required meticulous planning to ensure uninterrupted customer traffic, showcasing the importance of operational efficiency during major transitions.
Deep dives
Importance of Managing Unmanaged Devices and Apps
Companies face significant challenges in securing data when employees use unmanaged devices and non-approved applications. Traditional identity and access management (IAM) and mobile device management (MDM) solutions often fall short in addressing these security gaps. It is essential to implement strategies that extend beyond conventional methods to safeguard sensitive information effectively. Organizations need to prioritize solutions that cater to the complexities of modern work environments.