How open-source & distributed models can win AI with MosaicML’s Naveen Rao | E1754

This Week in Startups

NOTE

Next Steps in Using Your Startup Data for AI

The next step for getting value from your startup's data depends on how much of it you have. With fewer than 100,000 words, you can feed the data directly into prompts, where it is converted into tokens. With data in the hundred-million range, fine-tuning is worth considering. With data in the billion range, you can pre-train a model and layer your data in. In the example given of 5,000 transcripts, the suggested approach is to feed the data into prompts, possibly with a light fine-tuning pass to condition the model for specific outputs.
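The rule of thumb above can be sketched as a small lookup. This is only an illustration: the function name `suggest_approach` is hypothetical, the unit is assumed to be words throughout, and the note does not say where one tier ends and the next begins, so the cutoffs between tiers here are assumptions.

```python
def suggest_approach(word_count: int) -> str:
    """Map a rough corpus size (in words) to the approach named in the note.

    Assumption: the note gives only "less than 100,000", "hundred million",
    and "billion" as reference points. Everything from 100,000 words up to
    one billion is treated as the fine-tuning tier here.
    """
    if word_count < 100_000:
        # Small enough to pass directly in the prompt.
        return "prompting"
    if word_count < 1_000_000_000:
        # Hundred-million range: consider fine-tuning.
        return "fine-tuning"
    # Billion range: pre-train and layer the data in.
    return "pre-training"


if __name__ == "__main__":
    for size in (50_000, 100_000_000, 2_000_000_000):
        print(f"{size:>13,} words: {suggest_approach(size)}")
```

The tiers are not mutually exclusive in practice: the note itself suggests combining prompting with a light fine-tuning pass for the 5,000-transcript example.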
