Researchers Gain New Understanding From Simple AI
In the last two years, artificial intelligence programs have reached a surprising level of linguistic fluency. The biggest and best of these are all based on an architecture invented in 2017 called the transformer. It serves as a kind of blueprint for the programs to follow, in the form of a list of equations.
But beyond this bare mathematical outline, we don’t really know what transformers are doing with the words they process. The popular understanding is that they can somehow pay attention to multiple words at once, allowing for an immediate “big picture” analysis, but how exactly this works—or if it’s even an accurate way of understanding transformers—is unclear. We know the ingredients, but not the recipe.
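The "paying attention to multiple words at once" described above can be sketched in a few lines of code. To be clear, this is an illustrative toy, not the architecture of any real model: actual transformers use learned query, key and value matrices and many parallel attention heads, while here each word is a hand-picked two-number vector and the same vectors play all three roles.

```python
# Toy sketch of the attention step inside a transformer layer.
# Every word's query is compared against every word's key at once,
# so each output mixes information from the whole sequence.
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    """Each query attends to all keys; its output is a weighted
    average of all value vectors (scaled dot-product attention)."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        weights = softmax(scores)  # weights sum to 1
        out = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Three "words", each a made-up 2-dimensional embedding.
vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(vecs, vecs, vecs)
```

Because every query is scored against every key in one pass, no word has to wait its turn, which is what gives the architecture its "big picture" character.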
Now, two studies by researchers at the company Anthropic have started to figure out, at a fundamental level, what transformers are doing when they process and generate text. The first of the two was released in December. “I’m very positive about this work. It’s interesting, promising, kind of unique and novel,” said one researcher at the Technion in Haifa, Israel.