The source material for most generative AI models, the so-called large language models, is essentially the entire internet: everything that's been posted online over the past 10 to 15 years. The data set contains far too many images for anyone to go through it and clean it up.
As pressure mounts on lawmakers to regulate artificial intelligence, another problem area of the technology is emerging: AI-generated images. Early research shows these images can be biased and perpetuate stereotypes. Bloomberg reporters Dina Bass and Leonardo Nicoletti dug deep into the data that powers this technology, and they join this episode to talk about how AI image generation works—and whether it’s possible to train the models to produce better results.
Read more: Humans Are Biased. Generative AI Is Even Worse
Listen to The Big Take podcast every weekday and subscribe to our daily newsletter: https://bloom.bg/3F3EJAK
Have questions or comments for Wes and the team? Reach us at bigtake@bloomberg.net.
See omnystudio.com/listener for privacy information.