DeepFloyd IF: A Generative AI Toy That Can Understand Text in Context

DeepFloyd IF was trained on a dataset called LAION (L-A-I-O-N). This is going to help IF understand text in context and produce it in a way that other models can't. The first prompt was "a film camera photo of a 1960 Southern California beachside burger restaurant." So you can see this is what Midjourney came back with. Still, the thing that people are most excited about, beyond a shadow of a doubt, is the fact that this model can actually get text into images, which is just something we haven't had yet.

If you've ever used Midjourney, DALL-E, Stable Diffusion or another text-to-image generator, you'll know that words are a weakness. Text (such as on signs) tends to be gibberish. DeepFloyd IF has started to solve that problem, and it's doing it as open source.

Referenced in the video:
https://twitter.com/DeepFloydIF
https://twitter.com/EMostaque/status/1652295961404645376
https://stability.ai/blog/deepfloyd-if-text-to-image-model
https://twitter.com/hardmaru/status/1651822596844048385
https://the-decoder.com/deepfloyd-if-is-a-crazy-good-text-to-image-model-and-open-source/
https://wandb.ai/geekyrakshit/deepfloyd/reports/A-Gentle-Introduction-to-DeepFloydAI-s-New-Diffusion-Model-IF--VmlldzozNTY3Nzc4
https://twitter.com/javilopen/status/1652387049268297729
https://huggingface.co/DeepFloyd
https://twitter.com/DavidVorick/status/1652070967412129793

Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown
The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/