The Atlantic

Revealed: The Authors Whose Pirated Books Are Powering Generative AI

Stephen King, Zadie Smith, and Michael Pollan are among thousands of writers whose copyrighted works are being used to train large language models.
Illustration by The Atlantic. Source: Getty.

Updated at 1:40 p.m. ET on September 25, 2023

Editor’s note: This article is part of The Atlantic’s series on Books3. Check out our searchable Books3 database to find specific authors and titles. A deeper analysis of what is in the database is here.

One of the most troubling issues around generative AI is simple: It’s being made in secret. To produce humanlike answers to questions, systems such as ChatGPT process huge quantities of written material. But few people outside of companies such as Meta and OpenAI know the full extent of the texts these programs have been trained on.

Some of it comes from Wikipedia and other online writing, but high-quality generative AI requires higher-quality input than is usually found on the internet; that is, it requires the kind found in books. In a lawsuit filed in California last month, the writers Sarah Silverman, Richard Kadrey, and Christopher Golden allege that Meta violated copyright laws by using their books to train LLaMA, a large language model similar to OpenAI’s (an algorithm that can generate text by mimicking the word patterns it finds in sample texts). But neither the lawsuit itself nor the commentary surrounding it has offered a look under the hood: We have not previously known for certain whether LLaMA was trained on Silverman’s, Kadrey’s, or Golden’s books, or any
