
Chat GPT-3 on AI & Human Risk

The Human Risk Podcast

CHAPTER

The Common Crawl Dataset

I was trained on a massive dataset of text that was collected from various sources. The Common Crawl contains over 45 terabytes of text data in multiple languages. While the dataset is vast and diverse, there are still some limitations to the types of information and topics that it contains. For example, while I can generate text that sounds very natural and human-like, I don't actually have consciousness or a real understanding of the world.

