Tuhin, an expert in model deployment and monitoring at any scale, joins the show to discuss self-hosting open access models. They explore trends in tooling and usage of open access models, common use cases for integrating self-hosted models, and how the boom in generative AI has influenced the ecosystem.
Podcast summary created with Snipd AI
Quick takeaways
Baseten offers a platform that simplifies model deployment with workflow management features.
Baseten focuses on scalability, data privacy, and multi-cloud deployment to meet diverse user needs.
Deep dives
Baseten: Building Infrastructure for ML
Baseten hosts machine learning models in a production-ready, scalable way. Their platform abstracts away the infrastructure complexity of running models, letting engineers and data scientists focus on their core work. Baseten maintains an open-source library called Truss, which lets users package a model and deploy it with just a few lines of code. The platform also provides workflow management features such as versioning, A/B testing, observability, and logging, making it easier for teams to iterate on and manage models in production.
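The "package a model and deploy it" workflow described above can be sketched roughly as follows. This is a hypothetical, simplified stand-in for a Truss-style model class with one-time setup and per-request hooks; the real library's exact signatures and wiring may differ.

```python
# Hypothetical sketch of a Truss-style model package: a class with a
# `load` hook (one-time setup) and a `predict` hook (per-request
# inference). Names and signatures are illustrative, not the official API.

class Model:
    def __init__(self):
        self._model = None

    def load(self):
        # One-time initialization: load weights, warm caches, etc.
        # A trivial callable stands in for a real model here.
        self._model = lambda text: text.upper()

    def predict(self, model_input: dict) -> dict:
        # Per-request inference; a serving platform would wrap this
        # method in an HTTP endpoint, batching, and autoscaling.
        result = self._model(model_input["text"])
        return {"output": result}
```

In this shape, the platform owns the server, scaling, and observability, while the user only supplies the two hooks.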
Challenges of Running ML Models in Production
Running machine learning models in production comes with challenges around latency, throughput, cost, and security. Baseten addresses these with a scalable service that handles variable traffic and delivers low-latency production inference. They also prioritize data privacy and security, letting users deploy models inside their own VPCs or AWS accounts to retain control and ownership of their data. Baseten simplifies the deployment workflow with version management, observability, and logging features that streamline the process and boost productivity.
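To make the latency and throughput concerns above concrete, here is a small client-side sketch (pure standard library; the names are illustrative assumptions, not part of any platform's API) that measures both for an arbitrary inference callable:

```python
import time
from statistics import median

def measure(predict, inputs):
    """Measure per-request latency and overall throughput for an
    inference callable -- the kind of numbers a serving platform
    must keep healthy under variable traffic."""
    latencies = []
    start = time.perf_counter()
    for x in inputs:
        t0 = time.perf_counter()
        predict(x)  # one inference request
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "p50_s": median(latencies),              # median latency, seconds
        "throughput_rps": len(inputs) / elapsed,  # requests per second
    }
```

Running such a probe against a deployed endpoint is a cheap way to sanity-check the latency/throughput trade-offs discussed here before committing to a serving setup.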
Expanding Infrastructure Support and Multi-Cloud Deployment
Baseten is continuously expanding its infrastructure support to accommodate various frameworks and runtime environments. They are actively working on compatibility with frameworks like TRTorch, ONNX, and TFLite, allowing users to bring their own frameworks and containers to run models. They are also releasing a multi-cluster feature that lets users deploy models across multiple cloud providers, bringing flexibility and control to enterprise customers. This multi-cloud approach caters to the growing demand for self-hosted solutions and opens new opportunities for scalability and performance optimization.
Future of Baseten: Fine-tuning and Dataset Collection
Looking ahead, Baseten plans to invest in fine-tuning and dataset collection. They see fine-tuning as crucial for giving users more control over and customization of pre-trained models. By aligning with OpenAI-style endpoints, they aim to collect datasets and produce more fine-tuned models for their users, opening up possibilities for improved model performance and personalized AI experiences. Baseten sees significant opportunities for individuals and companies to build tools that support the evolving AI and ML landscape.
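"Aligning with OpenAI endpoints" generally means exposing a self-hosted model behind the same request shape as OpenAI's chat completions API, so existing clients work unchanged. Here is a hedged, standard-library sketch of building such a request; the base URL and model name are placeholders, not real endpoints.

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str):
    """Build an OpenAI-style chat-completions request aimed at a
    self-hosted endpoint. `base_url` is a placeholder -- point it at
    wherever your model is actually served."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Because the wire format matches what OpenAI clients already emit, a provider that speaks it can also log the traffic as fine-tuning data, which is the dataset-collection angle mentioned above.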
We’re excited to have Tuhin join us on the show once again to talk about self-hosting open access models. Tuhin’s company Baseten specializes in model deployment and monitoring at any scale, and it was a privilege to talk with him about the trends he is seeing in both tooling and usage of open access models. We were able to touch on the common use cases for integrating self-hosted models and how the boom in generative AI has influenced that ecosystem.
Changelog++ members save 1 minute on this episode because they made the ads disappear. Join today!
Sponsors:
Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com
Fly.io – The home of Changelog.com — Deploy your apps and databases close to your users. In minutes you can run your Ruby, Go, Node, Deno, Python, or Elixir app (and databases!) all over the world. No ops required. Learn more at fly.io/changelog and check out the speedrun in their docs.
Typesense – Lightning fast, globally distributed Search-as-a-Service that runs in memory. You literally can’t get any faster!