#002 - How long to train a 70B LLM on 15T tokens using 1024 H100s?

ArchiCraft: Solution Architecture Insights for AI Engineering

Calculating the Computational Demands of Large Language Models

This chapter explores the computational requirements of training a 70 billion parameter model on a 15 trillion token dataset. It highlights why H100 GPUs fall short of their peak throughput in practice and why model FLOPs utilization (MFU) is the key factor when estimating wall-clock training time.
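As a worked version of the calculation the chapter describes, here is a minimal back-of-envelope sketch. It assumes the common 6·N·D approximation for training FLOPs, an H100 dense BF16 tensor-core peak of roughly 989 TFLOPS, and an illustrative 40% MFU; these figures are assumptions for the sketch, not numbers taken from the episode.

```python
# Back-of-envelope training-time estimate for a dense decoder-only LLM.
# Assumptions (illustrative, not from the episode): the standard 6*N*D
# FLOPs approximation, H100 dense BF16 peak of ~989 TFLOPS, and 40% MFU.

PARAMS = 70e9          # model parameters (N)
TOKENS = 15e12         # training tokens (D)
NUM_GPUS = 1024
PEAK_FLOPS = 989e12    # assumed H100 SXM dense BF16 peak, FLOP/s
MFU = 0.40             # assumed model FLOPs utilization

total_flops = 6 * PARAMS * TOKENS              # ~6.3e24 FLOPs
cluster_flops = NUM_GPUS * PEAK_FLOPS * MFU    # sustained cluster throughput
seconds = total_flops / cluster_flops

print(f"Total training compute: {total_flops:.2e} FLOPs")
print(f"Wall-clock time: {seconds / 86400:.0f} days "
      f"({seconds * NUM_GPUS / 3600:.2e} GPU-hours)")
```

Under these assumptions the run takes roughly 180 days, or about 4.4 million GPU-hours; halving the MFU doubles the wall-clock time, which is why utilization dominates the answer.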
