Jonathan Frankle, Chief Scientist at MosaicML, discusses the future of training specialized models, MosaicML's place inside Databricks, and responsible AI practices. The conversation covers LLM-based systems, one-way hash functions, and integration with the Databricks platform. It also touches on the trade-off between model size, cost, and complexity in AI training.
Custom AI models tailored to specific data are more effective in solving unique problems.
MosaicML recommends a crawl-walk-run approach to finding the optimal trade-off between model size, training time, and data quantity.
MosaicML focuses on efficiency and cost reduction, making custom model training accessible to businesses of all sizes.
Deep dives
Customization through Training Specialized Models
MosaicML focuses on training custom AI models tailored to a company's specific data, goals, and requirements. The company's view is that one-size-fits-all models are not an efficient way to solve unique problems. By providing the tools and infrastructure, MosaicML enables customers to train specialized, purpose-built models, such as generative AI models, to gain a competitive edge in their respective industries.
The Process of Custom Model Training
Training a custom model starts with gathering a large, relevant text dataset, with an emphasis on data written as natural-language sentences. It is crucial to filter out low-quality data and to experiment with different data sources and types. MosaicML recommends a crawl-walk-run approach: start small, then gradually scale up both the model size and the amount of training data. The goal is to find the optimal trade-off between model size, training time, and data quantity.
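The crawl-walk-run idea can be made concrete with a rough compute estimate for each stage. The sketch below is illustrative only: the stage sizes are hypothetical numbers chosen for the example, not figures from the episode, and the cost formula is the widely used rule of thumb of about 6 floating-point operations per parameter per training token, not a MosaicML method.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    params: float  # number of model parameters (hypothetical)
    tokens: float  # number of training tokens (hypothetical)


def approx_training_flops(params: float, tokens: float) -> float:
    """Rule-of-thumb estimate: about 6 FLOPs per parameter per token."""
    return 6.0 * params * tokens


# Hypothetical ladder: each stage scales up model size and data together.
LADDER = [
    Stage("crawl", 125e6, 2.5e9),
    Stage("walk", 1e9, 20e9),
    Stage("run", 7e9, 140e9),
]


def plan(stages):
    """Return (stage name, estimated training FLOPs) for each stage."""
    return [(s.name, approx_training_flops(s.params, s.tokens)) for s in stages]


if __name__ == "__main__":
    for name, flops in plan(LADDER):
        print(f"{name:>5}: ~{flops:.2e} training FLOPs")
```

Comparing the estimates across stages shows how quickly cost grows, which is the reason to validate the data and the setup on the cheap stages before committing to the expensive one.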
Efficiency and Cost Considerations
MosaicML places a strong emphasis on efficiency and on reducing costs for customers. The company has pioneered research in algorithm design to make training and inference more efficient. It also aims to reduce the cost of model deployment by weighing factors such as model size, training time, ease of serving, and computational feasibility. MosaicML's goal is to make custom model training affordable and accessible to businesses of all sizes.
Adapting to Changing Data and Timeliness
How often a model needs to be updated or retrained depends on the use case and on how much timeliness matters. Some applications require frequent updates, while others are not time-sensitive. MosaicML encourages experimentation to find the right balance between retraining from scratch and incorporating new data. Suggested approaches to test include training on subsets of the data and fine-tuning an existing model on the new data, then comparing which strategy is most effective.
Considerations for Model Size and Serving Efficiency
Model size and serving efficiency drive both the cost and the performance of AI models. MosaicML recommends starting with smaller models and increasing their size only as the application requires. Larger models are also harder and more expensive to train and serve, so ease of management should be part of the decision. The aim is to balance model size, training time, and serving efficiency for the application at hand.
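One simple way to see why model size affects serving cost is to estimate the memory needed just to hold the weights. The sketch below is a back-of-the-envelope calculation under stated assumptions: the function name and the example sizes are hypothetical, and it ignores activations, caches, and runtime overhead, all of which add to the real footprint.

```python
# Bytes used to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Approximate memory (decimal GB) needed to store the weights alone."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Hypothetical model sizes, for illustration only.
    for params in (1e9, 7e9, 30e9):
        print(f"{params:.0e} params at fp16: ~{weight_memory_gb(params):.0f} GB")
```

Under this estimate, doubling the parameter count doubles the weight memory at a given precision, which is one reason a smaller model that meets the application's needs is cheaper to serve.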
Jonathan Frankle is the Chief Scientist at MosaicML, which was recently bought by Databricks for $1.3 billion.
MosaicML helps customers train generative AI models on their data. Lots of companies are excited about gen AI, and the hope is that their company data and information will be what sets them apart from the competition.
In this conversation with Tristan and Julia, Jonathan discusses a potential future where you can train specialized, purpose-built models, the future of MosaicML inside of Databricks, and the importance of responsible AI practices.
For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.
The Analytics Engineering Podcast is sponsored by dbt Labs.