Want to help define the AI Engineer stack? Have opinions on the top tools, communities and builders? We’re collaborating with friends at Amplify to launch the first State of AI Engineering survey! Please fill it out (and tell your friends)!
In March, we opened our GPT-4 coverage by framing one of this year’s key forks in the road as the “Year of Multimodal vs Multimodel AI”. Six months in, neither has fully panned out. The vast majority of LLM usage still defaults to chatbots built atop OpenAI (per our LangSmith discussion), and rumored GPU shortages have prevented a broader rollout of GPT-4 Vision. Most “AI media” demos like AI Drake and AI South Park turned out to be heavily human-engineered, to the point where the AI label is more marketing than an honest reflection of the value contributed.
However, the biggest impact of multimodal AI on our lives this year has been a relatively simple product: the daily HN Recap podcast produced by Wondercraft.ai, a 5-month-old AI podcasting startup. As swyx observed, the “content flippening” (the point at which the majority of the content you choose to consume is primarily AI-generated/augmented rather than primarily human/manually produced) has now gone from unthinkable to possible.
For full show notes, go to: https://latent.space/p/wondercraft
Timestamps
* [00:03:15] What is Wondercraft?
* [00:08:22] Features of Wondercraft
* [00:10:42] Types of Podcasts
* [00:11:44] The Importance of Consistency
* [00:14:01] Wondercraft House Podcasts
* [00:19:27] Video Translation and Dubbing
* [00:21:49] Building Wondercraft in 1 Day
* [00:24:25] What is your moat?
* [00:30:37] Audio Generation stack
* [00:32:12] How Important is it to Sound Human? (The AI Uncanny Valley)
* [00:36:02] AI Watermarking
* [00:36:32] The Text to Speech Industry
* [00:41:19] Voice Synthesis Research
* [00:45:53] AI Podcaster interviews Human Podcaster
* [00:50:38] Takeaway
Get full access to Latent Space at
www.latent.space/subscribe