How open-source & distributed models can win AI with MosaicML’s Naveen Rao | E1754

This Week in Startups

NOTE

Next Steps in Using Your Startup Data for AI

The next step for getting value from your startup's data depends on how much of it you have. With fewer than 100,000 words, you can put the data directly into prompts, where it is consumed as tokens. For data in the hundred-million range, fine-tuning can be considered. For data in the billion range, you can pre-train a model and layer the data in. In the example given of 5,000 transcripts, the suggested approach is to put the data into prompts, possibly with light fine-tuning to condition the model for specific outputs.
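The rule of thumb above can be sketched as a small lookup. This is only an illustration: the function name `suggest_approach` is hypothetical, the cutoffs are read loosely from the figures mentioned in the note, and treating every tier as a word count is an assumption. The note does not say what to do between 100,000 and roughly a hundred million words, so the sketch leaves that range undecided rather than guessing.

```python
# Rough cutoffs taken from the note; exact boundaries are an assumption.
PROMPT_LIMIT_WORDS = 100_000            # "less than 100,000 words"
FINE_TUNE_MIN_WORDS = 100_000_000       # "range of hundred million"
PRE_TRAIN_MIN_WORDS = 1_000_000_000     # "billion range"


def suggest_approach(word_count: int) -> str:
    """Return a suggested approach for a dataset of the given size in words."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    if word_count < PROMPT_LIMIT_WORDS:
        return "prompting"
    if word_count >= PRE_TRAIN_MIN_WORDS:
        return "pre-training"
    if word_count >= FINE_TUNE_MIN_WORDS:
        return "fine-tuning"
    # The note gives no guidance for this middle range.
    return "unspecified"


if __name__ == "__main__":
    for size in (50_000, 250_000_000, 3_000_000_000):
        print(f"{size:>13,} words: {suggest_approach(size)}")
```

A size-based lookup like this is only a starting point; as the transcript example shows, light fine-tuning can still be layered on top of prompting when specific output formats matter.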

