Estimating the size of GPT-4, I assume it to be on the order of 10 to the power of 25 FLOPs of training compute. This is based on the recently declared reporting threshold of 10 to the power of 26. Additionally, I take the per-device throughput to be four times 10 to the power of 15 FLOPs per second, assuming 8-bit quantization in training. To estimate the training time for GPT-4, I refer to a source suggesting it took approximately 30,000 A100s for three to five months. While the specifics are unclear, this estimate aligns with the reported number of parameters and experts in the model.
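
As a rough sanity check on that reasoning, here is a minimal sketch of the arithmetic, assuming a dense BF16 peak of roughly 3.1 × 10^14 FLOP/s per A100 and a hypothetical 30% utilization; neither figure is stated in the episode, and the 4 × 10^15 FLOP/s device figure above is a separate assumption about 8-bit throughput.

```python
# Back-of-the-envelope check: does "~30,000 A100s for 3-5 months"
# land near 1e25 training FLOPs, i.e. below the 1e26 reporting threshold?
# Assumed values (not from the transcript): ~3.1e14 FLOP/s dense BF16
# peak per A100 and a hypothetical 30% hardware utilization.

A100_PEAK_FLOPS = 3.1e14        # dense BF16 peak, FLOP/s (per device)
UTILIZATION = 0.30              # assumed average utilization of that peak
NUM_GPUS = 30_000
SECONDS_PER_MONTH = 30 * 24 * 3600

for months in (3, 5):
    total_flops = (NUM_GPUS * A100_PEAK_FLOPS * UTILIZATION
                   * months * SECONDS_PER_MONTH)
    print(f"{months} months -> ~{total_flops:.1e} FLOPs "
          f"({total_flops / 1e26:.2f}x the 1e26 reporting threshold)")
```

Under these assumed values the estimate comes out around 2 to 4 × 10^25 FLOPs, which is consistent with the ~10^25 order of magnitude mentioned above and sits below the 10^26 reporting threshold.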
