Generative AI Models Are Sucking Up Data from All Over the Internet, Yours Included

Science Quickly

Exploring the Sources of Data for Generative AI Models

A discussion of the data sources used to train GPT-4, including Common Crawl and Webcrawler, and of the various types of open data available on the internet. Meta's investment in AI development is also mentioned.
