AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
Exploring the Capabilities of Sora Vision Model
The chapter dives deep into the Sora vision model, discussing its technologies and operation in predicting future frames in videos. It also explores the model's abilities in generating Minecraft videos and different camera angles for video scenes, as well as the challenges and potential applications of such advanced models.
Emil is the co-founder of palette.fm (colorizing B&W pictures with generative AI) and previously worked on deep learning at Google Arts & Culture.
We were talking about Sora on a daily basis, so I decided to record our conversation, and then proceeded to confront him about AI risk.
Patreon: https://www.patreon.com/theinsideview
Sora: https://openai.com/sora
Palette: https://palette.fm/
Emil: https://twitter.com/EmilWallner
OUTLINE
(00:00) this is not a podcast
(01:50) living in parallel universes
(04:27) palette.fm - colorizing b&w pictures
(06:35) Emil's first reaction to sora, latent diffusion, world models
(09:06) simulating minecraft, midjourney's 3d modeling goal
(11:04) generating camera angles, game engines, metadata, ground-truth
(13:44) doesn't remove all artifacts, surprising limitations: both smart and dumb
(15:42) did sora make emil depressed about his job
(18:44) OpenAI is starting to have a monopoly
(20:20) hardware costs, commoditized models, distribution
(23:34) challenges, applications building on features, distribution
(29:18) different reactions to sora, depressed builders, automation
(31:00) sora was 2y early, applications don't need object permanence
(33:38) Emil is pro open source and acceleration
(34:43) Emil is not scared of recursive self-improvement
(36:18) self-improvement already exists in current models
(38:02) emil is bearish on recursive self-improvement without diminishing returns now
(42:43) are models getting more and more general? is there any substantial multimodal transfer?
(44:37) should we start building guardrails before seeing substantial evidence of human-level reasoning?
(48:35) progressively releasing models, making them more aligned, AI helping with alignment research
(51:49) should AI be regulated at all? should self-improving AI be regulated?
(53:49) would a faster emil be able to take over the world?
(56:48) is competition a race to the bottom or does it lead to better products?
(58:23) slow vs. fast takeoffs, measuring progress in iq points
(01:01:12) flipping the interview
(01:01:36) the "we're living in parallel universes" monologue
(01:07:14) priors are unscientific, looking at current problems vs. speculating
(01:09:18) AI risk & Covid, appropriate resources for risk management
(01:11:23) pushing technology forward accelerates races and increases risk
(01:15:50) sora was surprising, things that seem far are sometimes around the corner
(01:17:30) hard to tell what's not possible in 5 years that would be possible in 20 years
(01:18:06) evidence for a brake on AI progress: sleeper agents, sora, bing
(01:21:58) multimodality transfer, leveraging video data, leveraging simulators, data quality
(01:25:14) is sora about length, consistency, or just "scale is all you need" for video?
(01:26:25) hijacking language models to say nice things is the new SEO
(01:27:01) what would michael do as CEO of OpenAI
(01:29:45) on the difficulty of budgeting between capabilities and alignment research
(01:31:11) ai race: the descriptive pessimistic view vs. the moral view, evidence of cooperation
(01:34:00) making progress on alignment without accelerating races, the foundational model business, competition
(01:37:30) what emil changed his mind about: AI could enable exploits that spread quickly, misuse
(01:40:59) michael's update as a friend
(01:41:51) emil's experience as a patron