Speed will win the AI computing battle with Tuhin Srivastava from Baseten
Mar 21, 2024
38:32
In this podcast, Tuhin Srivastava discusses why speed will be the key to winning the AI computing battle. The conversation covers efficient code solutions, surprising use cases for Baseten, and the defensibility of jobs in the AI industry, and also touches on the importance of speed, optimizations for different models, and the impact of hardware shortages.
Podcast summary created with Snipd AI
Quick takeaways
Efficient code solutions are more desirable than no-code solutions for AI products' speed and performance.
Baseten's focus on speed and efficiency in AI computing infrastructure gives it a competitive edge in the industry.
Deep dives
Baseten: Providing Fast, Scalable AI Infrastructure for Teams
Baseten offers fast, scalable AI infrastructure focused on inference, catering to engineering teams handling large models. The company's emphasis on efficient code over no-code solutions aims to empower engineers to write code and optimize performance. By offering strong, intuitive abstractions, Baseten seeks to make both easy and complex tasks simpler, supporting teams as they grow in scale.
Application Diversity on Baseten: From Small Projects to AI-Native Companies
Baseten serves a wide range of applications, from weekend side projects to AI-native companies like Descript. It powers AI features across varied products, enabling quick model deployment and workload co-location for rapid response times. Notable partnerships include helping Plan AI build a call center SDK and ship models quickly, and supporting Picnic Health's innovative data-driven medical record extraction models.
Training vs. Inference Workloads: Key Differences and Customer Needs
Inference and training workloads exhibit distinct customer SLA requirements and operational considerations. In inference, cluster co-location and network optimization are crucial for fast responses. Unlike training, inference workflows are more repetitive and standardized, requiring specific optimizations like speculative decoding. The customer shift towards buying technology, particularly in inference infrastructure, underscores the importance of speed and efficiency in deploying AI models.
Hardware Heterogeneity Challenges and Customers' Build vs. Buy Dilemma
Hardware heterogeneity poses challenges for AI infrastructure providers like Baseten, with complexities in optimizing solutions across different chips. Customers lean towards buying because speed to market is critical, opting for ready-made, efficient solutions over building infrastructure from scratch. The growing recognition that infrastructure is non-proprietary lets companies focus on product differentiation, leveraging external AI infrastructure for improved performance and scalability.
Episode notes
At a time when users are being asked to wait unthinkable seconds for AI products to generate art and answers, speed is what will win the battle heating up in AI computing. At least according to today's guest, Tuhin Srivastava, the CEO and co-founder of Baseten, which gives customers scalable AI infrastructure starting with inference. In this episode of No Priors, Sarah, Elad, and Tuhin discuss why efficient code solutions are more desirable than no code, the most surprising use cases for Baseten, and why all of their jobs are very defensible from AI.