Robert Nishihara, Anyscale CEO, discusses scaling AI models, generative AI's impact on enterprise interest, and the importance of quick deployment. The conversation covers challenges of handling massive amounts of data, evolving AI workloads, and the transition to multimodal AI models integrating text, audio, video, and images.
Generative AI is driving companies to prioritize AI capabilities for competitive advantage in today's business landscape.
Multimodal AI applications require new solutions in AI architecture to handle data-intensive workloads effectively.
Deep dives
The Future of AI: Embracing Multimodal Data
AI models are evolving to incorporate multiple types of data — text, audio, video, and images — leading to far more data-intensive applications. This shift to multimodal data presents new challenges, as existing systems were not designed to handle workloads that are both GPU-intensive and data-intensive, necessitating new solutions in AI architecture. Companies like Anyscale are focused on commercializing open-source projects like Ray, from UC Berkeley, to support these evolving AI workloads.
Ray's Origins and Evolution
Ray, an open-source project, initially aimed to scale compute-intensive AI workloads. As the AI landscape evolved, Ray adapted to support a wide range of AI applications, offering low-level core APIs alongside a rich library ecosystem for distributed training, inference, and data processing. Companies like Uber, Pinterest, Shopify, and Spotify leverage Ray to train and scale their AI workloads.
Challenges of Multimodal AI Applications
With the rise of multimodal AI applications, the emphasis has shifted from compute intensity toward data intensity. As AI expands into processing video and audio alongside text, handling data-intensive workloads poses new obstacles. Ray is built to manage the complexities of combining large datasets with GPU-intensive tasks, addressing critical challenges faced by companies doing extensive data processing and machine learning.
Future Prospects and System Challenges
The future of AI and machine learning lies in tackling complexities arising from the convergence of various computational resources and frameworks. To optimize performance and efficiency for heterogeneous workloads, solutions like Ray are pivotal. The evolving landscape calls for advancements in system design, data processing, and multi-accelerator support to meet the growing demands of AI applications and ensure seamless integration with diverse hardware resources.
In this episode of the AI + a16z podcast, Anyscale cofounder and CEO Robert Nishihara joins a16z's Jennifer Li and Derrick Harris to discuss the challenges of training and running AI models at scale; how a focus on video models — and the huge amount of data involved — will change generative AI models and infrastructure; and the unique experience of launching a company out of the UC Berkeley Sky Computing Lab (the successor to RISElab and AMPLab).
Here's a sample of the discussion, where Robert explains how generative AI has turbocharged the appetite for AI capabilities within enterprise customers:
"Two years ago, we would talk to companies, prospective customers, and AI just wasn't a priority. It certainly wasn't a company-level priority in the way that it is today. And generative AI is the reason a lot of companies now reach out to us . . . because they know that succeeding with AI is essential for their businesses, it's essential for their competitive advantage.
"And time to market matters for them. They don't want to spend a year hiring an AI infrastructure team, building up a 20-person team to build all of the internal infrastructure, just to be able to start to use generative AI. That's something they want to do today."
At another point in the discussion, he turns to the related challenge of developer productivity:
"One dimension where we try to go really deep is on the developer experience and just enabling developers to be more productive. This is a complaint we hear all the time with machine learning teams or infrastructure teams: They'll say that they hired all these machine learning people, but then the machine learning people are spending all of their time managing clusters or working on the infrastructure. Or they'll say that it takes 6 weeks or 12 weeks to get a model to transition from development to production . . . Or moving from a laptop to the cloud, and going from a single machine to scaling — these are expensive handoffs that often involve rewriting a bunch of code."