The Easy Questions That Stump Computers
What happens when you stack kindling and logs in a fireplace and then drop some matches is that you typically start a …
Surely a system smart enough to contribute to The New Yorker would have no trouble completing the sentence with the obvious word, fire. GPT-2 responded with “ick.” In another attempt, it suggested that dropping matches on logs in a fireplace would start an “irc channel full of people.”
Marcus wasn’t surprised. Commonsense reasoning—the ability to make mundane inferences using basic knowledge about the world, like the fact that “matches” plus “logs” usually equals “fire”—has resisted AI researchers’ efforts for decades. Marcus posted the exchanges to his Twitter account with his own commentary: “LMAO,” internet slang for a derisive chortle. Neural networks might be impressive linguistic mimics, but they clearly lack basic common sense.
Minutes later, Yejin Choi saw Marcus’s snarky tweet. The timing was awkward. Within the hour, Choi was scheduled to give a talk at a prominent AI conference on her latest research project: a system, nicknamed COMET, that was designed to use an earlier version of GPT-2 to perform commonsense reasoning.
Quickly, Choi—a senior research manager at the Allen Institute for AI in Seattle, who describes herself as an “adventurer at heart”—fed COMET the same prompt Marcus had used (with its wording slightly modified to match COMET’s input format):
Gary stacks kindling and logs and drops some matches.
COMET generated 10 inferences about why Gary might be dropping the matches. Not all of the responses made sense, but the first did: he “wanted to start a fire.” Choi strode up to the podium to include the results in her presentation. “It seemed only appropriate,” she said.