#27 - Hagay Lupesko, VP of Engineering at MosaicML/Databricks
Dec 9, 2023
Hagay Lupesko, VP of Engineering at MosaicML/Databricks, explores the capabilities and future of language models, the cost of training and fine-tuning models, the recent acquisition of MosaicML by Databricks, data security and privacy concerns, the debate between open-source and closed-source models, and how MosaicML's language models compare to other models on the market.
MosaicML is focused on making advanced AI accessible to all organizations, not just big tech companies, by providing a platform for training and serving large and complex deep learning models, with a specific focus on language models.
Databricks' acquisition of MosaicML aims to add generative AI capabilities to the Databricks platform, making large-scale AI more accessible to enterprises and integrating improved security, data privacy, and compatibility features.
Deep dives
About MosaicML and Language Models
MosaicML is a company that enables organizations to build large-scale AI models, including language models. Language models can generate or classify text. On many tasks they now match or exceed human performance, much as computer vision models did a few years ago. MosaicML focuses on making advanced AI accessible to all organizations, not just big tech companies. It offers a platform for training and serving large and complex deep learning models, with a specific focus on language models.
The Genesis of MosaicML
MosaicML was founded with the mission of making advanced AI accessible to any organization, regardless of size or funding. Its co-founders include Naveen Rao and Hanlin Tang, both of whom had extensive experience in the industry. As part of that mission, they first built Composer, an open-source library optimized for distributed training of neural networks. Later, they developed the MosaicML platform for training and serving large language models and other models such as text-to-image models. The platform handles the infrastructure considerations and efficiency optimizations that large-scale AI requires.
Cost and Serving Efficiency of Language Models
The cost of training a language model depends mainly on the model size and the amount of training data. For example, training MosaicML's 7-billion-parameter model on one trillion tokens costs roughly $200k to $250k. Serving efficiency matters as well: MosaicML optimizes for serving cost and latency, designing models that can be served efficiently on standard hardware, so organizations can deploy and use them without incurring high operational costs. MosaicML is also developing parameter-efficient fine-tuning techniques to improve cost-effectiveness further.
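To see how model size and token count drive a figure like the one quoted above, here is a rough back-of-the-envelope sketch. It is not MosaicML's own cost model. It assumes the common rule of thumb that training takes about 6 FLOPs per parameter per token, and the GPU peak throughput, utilization, and hourly price are illustrative assumptions that do not come from the episode.

```python
"""Back-of-the-envelope training cost estimate (illustrative assumptions only)."""


def estimate_training_cost(
    n_params: float,
    n_tokens: float,
    peak_flops_per_gpu: float = 312e12,  # assumed: NVIDIA A100 dense BF16 peak
    utilization: float = 0.40,           # assumed fraction of peak actually achieved
    price_per_gpu_hour: float = 2.50,    # assumed rental price in USD
) -> tuple[float, float]:
    """Return (gpu_hours, cost_usd) for a single training run."""
    # Rule of thumb: ~6 FLOPs per parameter per token (forward + backward pass).
    total_flops = 6.0 * n_params * n_tokens
    # Sustained throughput is only a fraction of the hardware peak.
    effective_flops_per_second = peak_flops_per_gpu * utilization
    gpu_hours = total_flops / effective_flops_per_second / 3600.0
    return gpu_hours, gpu_hours * price_per_gpu_hour


if __name__ == "__main__":
    # 7 billion parameters trained on 1 trillion tokens, as in the example above.
    hours, cost = estimate_training_cost(n_params=7e9, n_tokens=1e12)
    print(f"GPU-hours: {hours:,.0f}")
    print(f"Estimated cost: ${cost:,.0f}")
```

Under these assumed values the estimate lands inside the $200k to $250k range mentioned in the episode, but the result is sensitive to the utilization and price you plug in. Doubling either the parameter count or the token count doubles the estimate.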
MosaicML's Partnership with Databricks and Future Plans
MosaicML has been acquired by Databricks, a leading data and AI company. The acquisition aims to add generative AI capabilities to the Databricks platform, making large-scale AI more accessible to enterprises. MosaicML is integrating its capabilities into the Databricks platform, providing improved security, data privacy, and compatibility features. Going forward, MosaicML is focused on providing the best generative AI capabilities on the Databricks platform. The team will continue to optimize and develop its models, aiming to offer new models that balance quality, latency, cost, security, and compatibility with hardware infrastructure.
Join us and Hagay Lupesko, VP of Engineering at MosaicML/Databricks, as we dive into the rapidly evolving world of large language models (LLMs). We discuss the latest innovations and challenges in the field, with a spotlight on MosaicML's contributions. We also explore how MosaicML's strategies compare with those of industry giants like OpenAI/Microsoft and Anthropic/AWS, and with open-source initiatives such as Meta's LLaMA-2. Hagay offers expert insight into the varied approaches driving AI's future, making this a valuable listen for anyone interested in the trends and potential of large language models.