Managing Meta's millions of machines

Ship It! Cloud, SRE, Platform Engineering

Navigating Open Source Contributions and Managing AI Fleet at Meta

This chapter explores the team's upstream-first approach to open source contributions and how they balance contributing upstream with maintaining their own software. It covers the challenges of managing a large number of hosts, the trade-offs between major upgrades and rolling OS updates, and the optimization of Meta's extensive AI fleet. The conversation also touches on specialization within Meta's infrastructure teams, their on-prem setup, and upcoming projects involving systemd.
