Building AI Safely Is Getting Harder and Harder
This is Atlantic Intelligence, an eight-week series in which The Atlantic’s leading thinkers on AI will help you understand the complexity and opportunities of this groundbreaking technology.
The bedrock of the AI revolution is the internet, or more specifically, the ever-expanding bounty of data that the web makes available to train algorithms. ChatGPT, Midjourney, and other generative-AI models “learn” by detecting patterns in massive amounts of text, images, and videos scraped from the internet. The process entails hoovering up huge quantities of books, art, memes, and, inevitably, the troves of racist, sexist, and illicit material distributed across the web.
Earlier this week, Stanford researchers reported a particularly alarming example of that toxicity in LAION-5B, the largest publicly available image data set used to train AI models. The data set’s creator has paused distribution of LAION-5B while it evaluates the report’s findings, although this and earlier versions of the data set have already trained prominent AI models.