Companies like OpenAI are aware that a significant portion of the English-language text on the internet is copyrighted, yet they collect and use this data anyway to advance their technology. In the pursuit of building systems like ChatGPT, these companies are willing to bend and break legal rules. As they exhaust the supply of English-language text, they turn to other sources, such as transcribing audio and video content from the internet. OpenAI developed Whisper, a speech recognition tool, to transcribe audio files accurately so that the resulting text could be fed into its systems.
A Times investigation shows how the country’s biggest technology companies, as they raced to build powerful new artificial intelligence systems, bent and broke the rules from the start.
Cade Metz, a technology reporter for The Times, explains what he uncovered.
Guest: Cade Metz, a technology reporter for The New York Times.
For more information on today’s episode, visit nytimes.com/thedaily. Transcripts of each episode will be made available by the next workday.