3min chapter

Latent Space: The AI Engineer Podcast — Practitioners talking LLMs, CodeGen, Agents, Multimodality, AI UX, GPU Infra and all things Software 3.0 cover image

Training a SOTA Code LLM in 1 week and Quantifying the Vibes — with Reza Shabani of Replit

Latent Space: The AI Engineer Podcast — Practitioners talking LLMs, CodeGen, Agents, Multimodality, AI UX, GPU Infra and all things Software 3.0

CHAPTER

How to Train a Large Model on Multiple Epochs

Jod was like no let's just train it on all the data how many tokens do we have we tell him and he's like that's not enough where can we get more tokens okay. So mikele had this you know great idea to to train it on multiple epochs and so re-sampling the same data again yeah which can be risky or like or tens of thousands of times. We did that in in a matter of like two days for for the replet data as well removed any of any person any pii like personal information remove harmful content removing any of that stuff, then trained on top of that. i believe that replet tuned data has seen something like 680

00:00

Get the Snipd
podcast app

Unlock the knowledge in podcasts with the podcast player of the future.
App store bannerPlay store banner

AI-powered
podcast player

Listen to all your favourite podcasts with AI-powered features

Discover
highlights

Listen to the best highlights from the podcasts you love and dive into the full episode

Save any
moment

Hear something you like? Tap your headphones to save it with AI-generated key takeaways

Share
& Export

Send highlights to Twitter, WhatsApp or export them to Notion, Readwise & more

AI-powered
podcast player

Listen to all your favourite podcasts with AI-powered features

Discover
highlights

Listen to the best highlights from the podcasts you love and dive into the full episode