Authors, including Douglas Preston, allege that their novels were used without permission to train OpenAI's AI program ChatGPT. The podcast explores the controversy around this alleged copyright infringement and discusses the ethics of tech companies using copyrighted material. It also highlights other cases of tech giants facing copyright infringement allegations. The positive impact of lawsuits on tech companies, as demonstrated by the Spotify case, is also discussed.
OpenAI's chatbot, ChatGPT, demonstrated a deep understanding of copyrighted books, raising concerns that authors' works may have been used to train it without authorization.
The OpenAI lawsuit is likely to result in a negotiated settlement rather than a full trial, as past copyright class-action lawsuits have commonly led to settlements.
Deep dives
The Curiosity of a Writer
Douglas Preston, an author, explores the capabilities of OpenAI's chatbot, ChatGPT. He discovers that the AI knows details about his books, characters, and settings that go beyond what is publicly available. This raises concerns about whether OpenAI has used copyrighted material without permission. Preston joins a class-action lawsuit against OpenAI, alleging copyright infringement on an industrial scale.
Precedents in Copyright Infringement
The podcast delves into two notable legal cases: Google's book-scanning project and Spotify's streaming of unlicensed songs. In the Google case, the court deemed the mass copying of copyrighted books to be fair use, citing the societal benefit of creating a searchable database. In the Spotify case, a class-action lawsuit led to a settlement wherein Spotify paid for past copyright infringements and established a system for future streaming royalties. These cases provide insight into the potential outcomes of the lawsuit against OpenAI.
The Role of Lawsuits and Negotiations
The podcast highlights the likelihood of the OpenAI lawsuit leading to a negotiated settlement rather than a full trial. Past copyright class-action lawsuits have rarely gone to trial, with parties often reaching a settlement agreement. OpenAI stands to benefit from negotiating with the authors who collectively sued it, as doing so would streamline licensing deals for the vast amount of copyrighted material used to train its AI.
When best-selling thriller writer Douglas Preston began playing around with OpenAI's new chatbot, ChatGPT, he was, at first, impressed. But then he realized how much in-depth knowledge GPT had of the books he had written. When prompted, it supplied detailed plot summaries and descriptions of even minor characters. He was convinced it could only pull that off if it had read his books.
Large language models, the kind of artificial intelligence underlying programs like ChatGPT, do not come into the world fully formed. They first have to be trained on incredibly large amounts of text. Douglas Preston and 16 other authors, including George R.R. Martin, Jodi Picoult, and Jonathan Franzen, were convinced that their novels had been used to train GPT without their permission. So, in September, they sued OpenAI for copyright infringement.
This sort of thing seems to be happening a lot lately: one giant tech company or another "moves fast and breaks things," exploring the edges of what might or might not be allowed without first asking permission. On today's show, we try to make sense of what OpenAI allegedly did by training its AI on massive amounts of copyrighted material. Was that good? Was it bad? Was it legal?