Step aside Margaret Atwood and Stephen King, here’s a dystopian horror story for you: imagine spending years of your life toiling over a keyboard, carefully crafting sentences, paragraphs and chapters of a book that weaves together a plot, brings characters to life and enthralls readers for hundreds of pages. Now, watch as an AI eats it and spits your garbled works back at you.
Ever since OpenAI’s ChatGPT brought large language models (LLMs) to the wider world’s attention, it’s been suspected that writers’ work has been used to train them. LLMs are deep-learning systems with hundreds of millions of parameters that are taught to assemble sentences by hoovering up massive text datasets. Datasets such as most of the English-language internet, including Wikipedia.
Writers are speaking up. In June 2023, the American Authors Guild published a letter signed by thousands of members: “Millions of copyrighted books, articles, essays and poetry provide the ‘food’ for AI systems, endless meals for which there has been no bill. You’re spending billions of dollars to develop AI technology. It is only fair that you compensate us for using our writings, without which AI would be banal and extremely limited.” The UK’s Society of Authors (SoA) has also submitted evidence to a House of Lords inquiry into the subject, calling to protect copyright.
In July 2023, a trio of writers – Christopher Golden, Richard Kadrey). And digital copies of their books are contained in that “pile”, which wasn’t compiled by Meta but was apparently used to train its LLM.