Multimodal models are making it possible to create AI art and augment creativity across artistic mediums. This week on No Priors, Sarah and Elad talk with Suhail Doshi, the founder of Playground AI, an image generator and editor. Playground AI has been open-sourcing foundation diffusion models, most recently releasing Playground V2.5.
In this episode, Suhail talks with Sarah and Elad about how integrating language and vision models enhances multimodal capabilities, how the Playground team approached building a user-friendly interface to make AI-generated content more accessible, and the future of AI-powered image generation and editing.
Sign up for new podcasts every week. Email feedback to show@no-priors.com
Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @Suhail
Show Notes:
(0:00) Introduction
(0:52) Focusing on image generation
(3:01) Differentiating from other AI creative tools
(5:58) Training a Stable Diffusion model
(8:31) Long-term vision for Playground AI
(15:00) Evolution of AI architecture
(17:21) Capabilities of multimodal models
(22:30) Parallels between audio AI tools and image generation