2min chapter

How DALL·E works

The Data Exchange with Ben Lorica

CHAPTER

How to Make Sure Your Training Data Is as Clean as Possible

It's a more difficult problem than you'd imagine. Even for very simple prompts, we're not [unintelligible], like producing squares in the exact same locations. Different types of mitigations will actually run counter to each other. So if you remove a large fraction of this content from your data set, then you actually have less representation of women in your data set. So it's actually a complicated series of trade-offs. We were not trying to claim we've fully solved it. I think there's still a lot of work to be done.
