Estimating the size of GPT-4, I assume it to be on the order of 10 to the power of 25 FLOPs of training compute. This is based on the recently declared reporting threshold of 10 to the power of 26. Additionally, I take the per-device throughput to be four times 10 to the power of 15 FLOPs per second, assuming 8-bit quantization in training. To estimate the training time for GPT-4, I refer to a source suggesting it took approximately 30,000 A100s for three to five months. While the specifics are unclear, this estimate aligns with the reported number of parameters and experts in the model.
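
As a rough sanity check on that reasoning, here is a minimal sketch of the arithmetic, assuming a dense BF16 peak of roughly 3.1 × 10^14 FLOP/s per A100 and a hypothetical 30% utilization; neither figure is stated in the episode, and the 4 × 10^15 FLOP/s device figure above is a separate assumption about 8-bit throughput.

```python
# Back-of-the-envelope check: does "~30,000 A100s for 3-5 months"
# land near 1e25 training FLOPs, i.e. below the 1e26 reporting threshold?
# Assumed values (not from the transcript): ~3.1e14 FLOP/s dense BF16
# peak per A100 and a hypothetical 30% hardware utilization.

A100_PEAK_FLOPS = 3.1e14        # dense BF16 peak, FLOP/s (per device)
UTILIZATION = 0.30              # assumed average utilization of that peak
NUM_GPUS = 30_000
SECONDS_PER_MONTH = 30 * 24 * 3600

for months in (3, 5):
    total_flops = (NUM_GPUS * A100_PEAK_FLOPS * UTILIZATION
                   * months * SECONDS_PER_MONTH)
    print(f"{months} months -> ~{total_flops:.1e} FLOPs "
          f"({total_flops / 1e26:.2f}x the 1e26 reporting threshold)")
```

Under these assumed values the estimate comes out around 2 to 4 × 10^25 FLOPs, which is consistent with the ~10^25 order of magnitude mentioned above and sits below the 10^26 reporting threshold.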
