Sarah Silverman’s AI Lawsuit Raises Important Copyright Infringement Questions
Comedian Sarah Silverman and two other authors are suing Meta and ChatGPT-maker OpenAI for alleged copyright infringement. The two lawsuits allege that the companies’ large language models were trained on copyrighted material from the authors’ books without their knowledge or permission, and that the material was likely obtained from a “shadow library” of pirated works.
The OpenAI suit includes exhibits claiming that ChatGPT was able to summarize three books when prompted: Silverman’s The Bedwetter, Ararat by Christopher Golden, and Richard Kadrey’s Sandman Slim. The Meta suit cites multiple works by Kadrey and Golden, alongside The Bedwetter, and flags a Meta paper indicating that LLaMA’s training datasets included material taken from these shadow libraries, which the suit describes as “flagrantly illegal.”
The complaint against OpenAI states that “when ChatGPT is prompted, ChatGPT generates summaries of Plaintiffs’ copyrighted works—something only possible if ChatGPT was trained on Plaintiffs’ copyrighted works.” The authors “did not consent to the use of their copyrighted books as training material for ChatGPT,” according to the complaint.
The case may be difficult to win given Google’s success in beating back legal challenges to its online book library. In 2016, the U.S. Supreme Court let stand lower court rulings that rejected authors’ claim that Google’s digitizing of millions of books and showing small portions of them to the public amounted to “copyright infringement on an epic scale.”
Concerns about the tech industry’s AI-building practices have gained traction in literary and artist communities, with many worried that these tools could upend the livelihoods of creatives.
An open letter organized by the Authors Guild and signed by more than 4,000 writers reads: “Millions of copyrighted books, articles, essays and poetry provide the ‘food’ for AI systems, endless meals for which there has been no bill. You’re spending billions of dollars to develop AI technology. It is only fair that you compensate us for using our writings, without which AI would be banal and extremely limited.”
OpenAI and other top AI developers have been secretive about their sources of data and have yet to formally comment on the lawsuits. But once the cases proceed, tech executives could have to testify under oath about the sources of the books they downloaded.