Ideogram CEO Mohammad Norouzi discusses the evolution of transformer models, diffusion models, and their impact on AI technology. He shares insights on the transition from researcher to startup CEO, user-centric product development, and fostering creativity with AI image models.
ANECDOTE
Early AI Passion and Learning
Mohammad Norouzi grew up in Iran and spent his early years drawing and listening to stories.
He taught himself neural networks in 2007 by reading academic papers and implementing models from scratch.
INSIGHT
Diffusion vs Transformer Models
Diffusion models generate images by starting from noise and refining it iteratively, unlike transformers that generate token by token.
This iterative refinement aligns more closely with how humans create art, starting from a sketch and refining it.
ANECDOTE
The Transformer Paper's Surprising Impact
Mohammad discussed the release of the transformer paper with its lead author, Ashish Vaswani, who saw its importance immediately.
Nobody initially envisioned that the transformer architecture would revolutionize both language and vision tasks so broadly.
In this episode, Ideogram CEO Mohammad Norouzi joins a16z General Partner Jennifer Li, as well as Derrick Harris, to share his story of growing up in Iran, helping build influential text-to-image models at Google, and ultimately cofounding and running Ideogram. He also breaks down the differences between transformer models and diffusion models, as well as the transition from researcher to startup CEO.
Here's an excerpt where Mohammad discusses the reaction to the original transformer architecture paper, "Attention Is All You Need," within Google's AI team:
"I think [lead author Asish Vaswani] knew right after the paper was submitted that this is a very important piece of the technology. And he was telling me in the hallway how it works and how much improvement it gives to translation. Translation was a testbed for the transformer paper at the time, and it helped in two ways. One is the speed of training and the other is the quality of translation.
"To be fair, I don't think anybody had a very crystal clear idea of how big this would become. And I guess the interesting thing is, now, it's the founding architecture for computer vision, too, not only for language. And then we also went far beyond language translation as a task, and we are talking about general-purpose assistants and the idea of building general-purpose intelligent machines. And it's really humbling to see how big of a role the transformer is playing into this."