Ep 21: Modal CEO Erik Bernhardsson on Bringing Development to the Cloud, the GPU Market, and GenAI Music
Oct 31, 2023
Erik Bernhardsson, founder of Modal Labs, discusses the AI chip market, popular GenAI use cases, Oracle Cloud's resurgence, challenges for AI developers, GPU access today, Vector DB companies, slow cloud adoption, AI music generation, over-hyped/under-hyped topics, and what Erik wishes he knew when starting Modal.
Modal's strength lies in its ability to quickly boot containers, provide GPU support, and offer serverless capabilities, making it an attractive platform for running generative AI models.
Modal addresses the problems of deploying custom models, high costs, and limited customization with a serverless platform that supports custom models, boots containers quickly, and uses GPUs efficiently, yielding lower costs, faster feedback loops, and a better developer experience.
The future of cloud computing is predicted to involve all development happening in the cloud, with higher-level services offering improved developer experiences, removing complexities and enabling seamless local development environments.
Deep dives
Modal's journey from general compute platform to generative AI applications
Modal started as a general-purpose compute platform that let data teams run any type of workload, but over the years it has emerged as a powerful tool for building generative AI applications. Founder Erik Bernhardsson discusses his motivation for starting Modal: creating better tools for working with data. As the team dug into the infrastructure problem, they realized they needed to build their own file system and focus on making it easier to run code in the cloud. Modal's strength lies in fast container boots, GPU support, and serverless execution, which makes it an attractive platform for running generative AI models.
The advantages of using Modal for custom code and serverless GPU inference
Modal offers custom code deployment and serverless GPU inference, easing the complexity and high cost of running large-scale models. Developers can run their own Kubernetes cluster, but that is cumbersome and expensive; providers that offer AI behind an API, meanwhile, lack the customization and flexibility needed for proprietary models or custom workflows. Modal addresses both problems with a serverless platform that supports custom models, boots containers quickly, and uses GPUs efficiently, giving developers lower costs, faster feedback loops, and a better overall experience.
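As a rough illustration of the workflow described above, a serverless GPU function on Modal can be defined in a few lines of Python. This is a hedged sketch based on Modal's public SDK, not code from the episode; the app name, GPU type, model, and entrypoint are all illustrative choices, and the exact API surface may differ by SDK version.

```
import modal

# Hypothetical names throughout; Modal's real API may differ in detail.
app = modal.App("genai-inference-sketch")

# Container image with the model's dependencies, built once and reused,
# which is what makes fast cold boots possible.
image = modal.Image.debian_slim().pip_install("torch", "transformers")

@app.function(gpu="A10G", image=image)
def generate(prompt: str) -> str:
    # Runs in a serverless container that boots on demand;
    # the GPU is billed only while this function executes.
    from transformers import pipeline
    pipe = pipeline("text-generation", model="gpt2")
    return pipe(prompt, max_new_tokens=40)[0]["generated_text"]

@app.local_entrypoint()
def main():
    # Invoked via `modal run`; .remote() ships the call to the cloud.
    print(generate.remote("Serverless GPUs let you"))
```

The point of the sketch is the shape of the developer experience: a decorator turns an ordinary Python function into a remotely executed, GPU-backed endpoint, with no cluster to manage.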
The future of AI in terms of model layer, GPU access, and cost optimization
The podcast delves into the future of AI, discussing potential developments in the model layer, GPU access, and cost optimization. While there are debates about whether a single model will dominate the cloud or if there will be a broader range of open-source models, the guest expresses optimism for open-source models to gain traction and potentially close the gap with proprietary models. They also explore the dynamics of GPU access and changing GPU markets, highlighting the challenges of acquiring GPUs at scale and the potential for GPU prices to normalize. Furthermore, they discuss the optimization of inference costs and the potential for future improvements in performance and pricing. Finally, the guest shares their excitement for the growth of vector databases and vector-based applications, which can unlock new opportunities in various domains, including generative AI.
Benefits of Cloud Adoption and Future of Cloud Computing
The podcast episode discusses the current state of cloud adoption and the future of cloud computing. The guest speaker highlights that many developers still write code as if the cloud doesn't exist, deploying code into the cloud but not fully embracing cloud-based development workflows. However, the speaker predicts that all development will eventually move into the cloud, with higher-level services offering improved developer experiences. The guest emphasizes the potential for the cloud to enable seamless local development experiences that mirror cloud environments, removing the complexities of switching between different environments. Despite acknowledging the challenges in the developer experience, the guest remains bullish on major cloud providers like AWS, GCP, Azure, and even Oracle, noting their economies of scale and ability to offer high-quality services. Overall, the guest foresees a future where development increasingly happens in the cloud.
The Potential Impact of AI on Cloud Computing and the Rise of Modal
In the podcast episode, the guest shares insights on the potential impact of AI workloads on cloud computing and discusses the rise of Modal, a platform offering a polished developer experience for running AI models in the cloud. The guest expresses optimism for the big three cloud providers, AWS, GCP, and Azure, highlighting their economies of scale and data center investments, while also noting the emergence of new players like CoreWeave and the growing importance of GPU capabilities in the cloud space. The guest emphasizes the need for high-caliber compute and for higher-level services focused on specific use cases and workflows. Additionally, the guest explores the challenges of selling to enterprise companies, where platform teams and AI teams often have different perspectives and needs. Despite these challenges, the guest believes that Modal's developer-centric approach and comprehensive toolset position it well in the AI-driven cloud computing landscape.
Jacob and Pat sit down with Erik Bernhardsson, the founder of Modal Labs, a data infrastructure company providing GPU compute to data teams. On this episode we discussed Erik’s thoughts on the AI chip market, the most popular GenAI use cases on Modal, and even Oracle Cloud’s resurgence in the AI start-up market.
0:00 intro
0:45 motivation for founding Modal
6:35 advantages that Modal gives developers
9:21 early applications built with Modal
11:58 challenges for AI developers
16:31 GPU access today
20:09 Vector DB companies
24:55 why is cloud adoption so slow?
31:30 Oracle Cloud
39:22 AI music generation
42:05 over-hyped/under-hyped
43:26 what Erik wishes he knew when starting Modal
45:53 episode debrief
With your co-hosts:
@ericabrescia
- Former COO of GitHub, founder of Bitnami (acquired by VMware)
@patrickachase
- Partner at Redpoint, former ML engineer at LinkedIn
@jacobeffron
- Partner at Redpoint, former PM at Flatiron Health
@jordan_segall
- Partner at Redpoint