6min chapter

[Practical AI] AI Trends: a Latent Space x Practical AI crossover pod!

Latent Space: The AI Engineer Podcast — Practitioners talking LLMs, CodeGen, Agents, Multimodality, AI UX, GPU Infra and all things Software 3.0

CHAPTER

How to Mix Up Public Data to Make Your Model Better

I was thinking about unlabeled datasets for unsupervised learning or self-supervised learning, right? That is something we are trying to wrap our heads around: Common Crawl, the Stack Overflow archive, the books. Nobody has a straight answer as to what the data mix should be, and everyone just kind of experiments. I get the sense that OpenAI doesn't want to encourage that anymore. They don't have fine-tuning for GPT-3.5 and GPT-4. So yeah, I would encourage people, specifically on this topic, to maybe give some models a go under the hood, try different mixes, and gain your own intuition about how things might change based on the mix of data. We have two questions for you
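As a rough illustration of what experimenting with a data mix can look like, here is a minimal Python sketch that samples training documents from several corpora according to per-source weights. The function name `sample_mixture`, the toy corpora, and the two candidate mixes are all hypothetical placeholders and are not from the episode; a real experiment would stream actual datasets and train a model on each mix before comparing results.

```python
import random
from collections import Counter


def sample_mixture(sources, weights, n, seed=0):
    """Draw n documents, choosing the source of each draw in proportion to its weight."""
    rng = random.Random(seed)
    names = list(sources)
    probs = [weights[name] for name in names]
    mixed = []
    for _ in range(n):
        # Pick a source by weight, then a document uniformly from that source.
        name = rng.choices(names, weights=probs, k=1)[0]
        mixed.append((name, rng.choice(sources[name])))
    return mixed


# Toy stand-ins for real corpora (assumed for illustration only).
sources = {
    "common_crawl": ["cc doc 1", "cc doc 2", "cc doc 3"],
    "stack_overflow": ["so post 1", "so post 2"],
    "books": ["book passage 1", "book passage 2"],
}

# Two hypothetical mixes to compare; weights need not sum to 1.
candidate_mixes = {
    "web_heavy": {"common_crawl": 0.8, "stack_overflow": 0.1, "books": 0.1},
    "balanced": {"common_crawl": 1.0, "stack_overflow": 1.0, "books": 1.0},
}

for mix_name, weights in candidate_mixes.items():
    batch = sample_mixture(sources, weights, n=1000)
    counts = Counter(source for source, _ in batch)
    print(mix_name, dict(counts))
```

Printing the per-source counts for each mix is only a sanity check that the sampler respects the weights. The intuition the speaker describes would come from training or fine-tuning on each mix and comparing the resulting models.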
