If you have 30 dollars, a few hours, and one server, then you are ready to create a ChatGPT-like model that can do what’s known as instruction-following. Databricks’ latest launch, Dolly, foreshadows a potential move in the industry toward smaller and more accessible but extremely capable AIs. Plus, Dolly is open source and requires less computing power and fewer data parameters than its counterparts.
Matei Zaharia, Cofounder & Chief Technologist at Databricks, joins Sarah and Elad to talk about how big data sets actually need to be, why manual annotation is becoming less necessary to train some models, and how he went from a Berkeley PhD student with a little project called Spark to the founder of a company that is now critical data infrastructure that’s increasingly moving into AI.
No Priors is now on YouTube! Subscribe to the channel on YouTube and like this episode.
Show Links:
Sign up for new podcasts every week. Email feedback to show@no-priors.com
Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @Databricks | @Matei_Zaharia
Show Notes:
[01:29] - Origin of Databricks
[04:30] - Work at Stanford Lab
[05:29] - Dolly and Role of Open Source
[12:30] - Industry focus on high parameter count, understanding reasoning at small model scale
[18:42] - Enterprise applications for Dolly & chatbots
[25:06] - Making bets as an academic turned CTO
[36:23] - The early stages of AI and future predictions