Ever wondered what it really takes to train a massive AI model like the ones powering the latest tech? We move beyond speculation and get down to the numbers.
In this episode, we answer a very specific question: How long would it actually take to train a 70-billion-parameter Large Language Model on a colossal 15-trillion-token dataset using a supercomputer cluster of 1024 NVIDIA H100 GPUs?
Join us as we unpack this question and calculate the answer from two different angles:
🟦 The Top-Down Approach: Using real-world performance benchmarks published by NVIDIA.
🟦 The Bottom-Up Approach: Building a fundamental calculation from scratch based on the total number of floating-point operations (FLOPs) required and on system efficiency, measured as Model FLOPs Utilization (MFU).
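For readers who want to try the bottom-up estimate themselves, here is a minimal Python sketch. It is not the episode's exact calculation. It assumes the widely used rule of thumb that training costs about 6 FLOPs per parameter per token, a peak of roughly 989 TFLOPS of dense BF16 compute per H100 SXM GPU (NVIDIA's datasheet figure), and an MFU of 40%, which is an illustrative placeholder rather than a number taken from the episode. The function name `training_days` is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def training_days(params, tokens, num_gpus, peak_flops_per_gpu, mfu):
    """Estimate wall-clock training time in days.

    Assumes total training compute of about 6 * params * tokens FLOPs
    (a common approximation covering the forward and backward passes).
    """
    total_flops = 6 * params * tokens
    # Sustained cluster throughput: peak throughput scaled by MFU.
    sustained_flops_per_second = num_gpus * peak_flops_per_gpu * mfu
    return total_flops / sustained_flops_per_second / SECONDS_PER_DAY


# Scenario from the episode; peak throughput and MFU are assumptions.
days = training_days(
    params=70e9,                 # 70 billion parameters
    tokens=15e12,                # 15 trillion tokens
    num_gpus=1024,               # H100 GPUs in the cluster
    peak_flops_per_gpu=989e12,   # assumed dense BF16 peak per GPU
    mfu=0.40,                    # assumed Model FLOPs Utilization
)
print(f"Estimated training time: {days:.0f} days")
```

Under these assumptions the estimate comes out at roughly six months. Because time scales inversely with MFU, doubling the assumed efficiency halves the result, so the chosen MFU matters as much as the GPU count.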
Whether you're an AI practitioner, a tech enthusiast, or just curious about the scale of modern computation, this episode provides a concrete look at the time, resources, and complexity behind building state-of-the-art artificial intelligence.
Thank you for listening! ❤️
CONNECT WITH DMYTRO
🟦 LinkedIn: https://www.linkedin.com/in/dimanngo
🟦 Email: info@golodiuk.com
EPISODE LINKS (ORIGINAL BLOG POSTS)
Find the full blog post and all the calculations here: How Long to Train a 70B LLM on 15T Tokens with 1024 H100s
This podcast episode is an AI-narrated version of the original text-based article from Dmytro's personal blog, which you can find at www.golodiuk.com/news
ABOUT Dmytro | www.golodiuk.com
Dmytro Golodiuk is a highly experienced technology professional with over 17 years in the software industry. His proficiency spans cloud computing, enterprise platforms, software development, and integration technologies, with deep expertise in the Microsoft ecosystem. Dmytro combines his technical knowledge with formal Enterprise Architecture frameworks like TOGAF and ArchiMate to deliver robust and practical solutions.
In addition to his architectural work, Dmytro is a passionate mentor dedicated to helping others grow in their IT careers.
https://mentor.sh/mentors/dmytro_golodiuk
MENTORSHIP and WHAT I OFFER
🟦 A CLEAR ROADMAP: I'll help you forge the path from technical expertise to architectural vision. My focus isn't on specific technologies – you've got that covered. Instead, we'll concentrate on the strategic thinking, communication, and leadership skills that define a successful architect.
🟦 BRIDGING THE GAPS: Together, we'll identify and close the crucial gaps between a senior engineering role and the holistic view required of an architect.
🟦 FOSTERING YOUR GROWTH: My mentorship is about cultivating your ability to see the bigger picture, to design robust and effective solutions, and to communicate complex ideas with simplicity and impact.
🟦 ARCHITECT-READY CV PROFILE OPTIMISATION: I'll help you transform your engineering CV into a strategic narrative that showcases your architectural potential, leadership, and strategic contributions, so that it resonates with hiring managers.
🟦 ACE YOUR ARCHITECT INTERVIEW: I’ll prepare you for the full spectrum of interview scenarios.
IF YOU'RE A MID TO SENIOR ENGINEER WHO
🟦 Aspires to become a Solution Architect.
🟦 Recognizes the need to develop beyond deep technical skills.
🟦 Is ready to embrace the mindset and responsibilities of an architect.
✅ Then I'm the mentor you're looking for. Let's work together to unlock your potential and build the bridge to your future as a Solution Architect.
https://mentor.sh/mentors/dmytro_golodiuk