Next Steps in Using Your Startup Data for AI

How you get value from your startup's data depends on how much of it you have. With fewer than 100,000 words, you can put the data directly into the prompt as tokens. With data in the hundred-million range, fine-tuning is worth considering. With data in the billions, you can pre-train a model and layer your data in. In the example given, 5,000 transcripts, the suggested approach is prompting, possibly with a light fine-tune to condition the model for specific outputs.
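As a rough illustration, the size-based rule of thumb in the summary can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function name `suggest_approach` is hypothetical, sizes are assumed to be counted in words, and the summary does not say what to do between 100,000 words and the hundred-million range, so the sketch simply treats everything from 100,000 up to a billion as a fine-tuning candidate.

```python
def suggest_approach(word_count: int) -> str:
    """Map a rough dataset size (in words) to the approach named in the summary.

    The 100,000 cutoff comes from the summary. The 1,000,000,000 cutoff and the
    handling of sizes between the named ranges are assumptions for illustration.
    """
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    if word_count < 100_000:
        # Small enough to place directly in the prompt.
        return "prompting"
    if word_count < 1_000_000_000:
        # Assumption: anything below the billion range is a fine-tuning candidate.
        return "fine-tuning"
    # Billion range and above: pre-train and layer the data in.
    return "pre-training"


if __name__ == "__main__":
    for size in (50_000, 100_000_000, 2_000_000_000):
        print(f"{size:>13,} words: {suggest_approach(size)}")
```

The cutoffs are deliberately coarse. In the episode's own example the recommendation mixes approaches (prompting plus a light fine-tune), so a real decision would weigh more than raw size.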

This Week in Startups is presented by:

Vanta. Compliance and security shouldn't be a deal-breaker for startups to win new business. Vanta makes it easy for companies to get a SOC 2 report fast. TWiST listeners can get $1,000 off for a limited time at vanta.com/twist.

Trovata. Starting up is hard. Trovata makes managing cash easy. Start automating your cash management at Trovata.io/TWIST. Use code TWIST for 30% off one full year of premium features like AI forecasting.

The Microsoft for Startups Founders Hub helps all founders build a better startup, at a lower cost, from day one. Startups get up to $150K in Azure credits, access to free OpenAI credits, free dev tools like GitHub, technical advisory, access to mentors and experts, and so much more. There is no funding requirement, and it only takes minutes to join. Sign up today at aka.ms/thisweekinstartups.

Today's show:

MosaicML Co-Founder and CEO Naveen Rao joins Jason to discuss the open-source vs. closed AI debate, the profound impact of AI on society (41:06), AI’s rapid pace of change, and its implications for the future of employment and education (40:42). They wrap the show by breaking down the potential problems with centralized regulation (54:37).

Follow Naveen: https://twitter.com/NaveenGRao

Check Out MosaicML: https://mosaicml.com

Time stamps:

(00:00) Naveen Rao joins Jason

(2:54) MosaicML and its purpose

(5:10) Obtaining datasets and incentivizing creators

(8:30) Vanta - Get $1,000 off your SOC 2 at https://vanta.com/twist

(9:37) The process of using your data with MosaicML

(11:55) Defining tokens and prompts

(16:53) Fine-tuning the AI model and reinforcement learning

(19:27) The competition with open-source models

(24:26) The cost of running AI models

(26:08) Trovata - Use code TWIST at https://trovata.io/twist for 30% off one year of premium features, like AI forecasting

(27:35) How the GPU crunch has affected cloud models

(32:13) Why demand will not cease

(34:21) Specialized models vs. general models

(39:12) Microsoft for Startups Founders Hub - Apply in 5 minutes for six figures in discounts at http://aka.ms/thisweekinstartups

(40:42) The impact AI will have on employment

(48:49) The impact AI will have on education

(54:37) Thoughts on OpenAI becoming ClosedAI

Read LAUNCH Fund 4 Deal Memo & Apply for Funding

Buy ANGEL

Great recent interviews: Brian Chesky, Aaron Levie, Sophia Amoruso, Reid Hoffman, Frank Slootman, Billy McFarland, PrayingForExits, Jenny Lefcourt

Check out Jason’s suite of newsletters: https://substack.com/@calacanis

Follow Jason:

Twitter: https://twitter.com/jason

Instagram: https://www.instagram.com/jason

LinkedIn: https://www.linkedin.com/in/jasoncalacanis

Follow TWiST: