The headlines have been nearly ecstatic, and almost uniformly uncritical. Here are just a few:
- The Verge: Computer passes Turing Test for first time by convincing judges it is a 13-year-old boy
- Venture Beat: Talk to the computer that passed the Turing Test, a historic artificial intelligence milestone
- Yahoo Tech: Turing Test Bested, Robot Overlords Creep Closer
- NBC News: Turing Test: Computer Program Convinces Judges It’s Human
- Washington Post: A computer just passed the Turing Test in landmark trial
- The Independent: Turing Test breakthrough as super-computer becomes first to convince us it’s human
As several commentators have pointed out, the “victory” is pretty dubious. Mike Masnick at TechDirt was quick to question the alleged result, listing several important points that call it into question:
- It’s not a “supercomputer,” it’s a chatbot. It’s a script made to mimic human conversation. There is no intelligence, artificial or not, involved. It’s just a chatbot.
- Plenty of other chatbots have similarly claimed to have “passed” the Turing test in the past (often with higher ratings). Here’s a story from three years ago about another bot, Cleverbot, “passing” the Turing Test by convincing 59% of judges it was human (much higher than the 33% Eugene Goostman claims).
- It “beat” the Turing test here by “gaming” the rules — by telling people the computer was a 13-year-old boy from Ukraine so that judges would mentally explain away its odd responses.
- The “rules” of the Turing test always seem to change. Hell, Turing’s original test was quite different anyway.
- As Chris Dixon points out, you don’t get to run a single test with judges that you picked and declare you accomplished something. That’s just not how it’s done. If someone claimed to have created nuclear fusion or cured cancer, you’d wait for some peer review and repeat tests under other circumstances before buying it, right?
- The whole concept of the Turing Test itself is kind of a joke. While it’s fun to think about, creating a chatbot that can fool humans is not really the same thing as creating artificial intelligence. Many in the AI world look on the Turing Test as a needless distraction.
I personally think that the test still has an important place in our thinking about artificial intelligence, but there’s certainly nothing wrong with questioning its value (in science there is no such thing as unquestionable canon, after all), and Masnick’s other points are pretty much on the money.
The Guardian also ran with the original dramatic story (“Computer simulating 13-year-old boy becomes first to pass Turing test”), prompting a number of comments taking it to task, which it had the good sense to publish as well (“Claims that the Turing test has been passed are nonsense”).
For instance, Professor Robert Epstein of the American Institute for Behavioral Research and Technology (bio here) wrote:
Professor Warwick’s claim that a computer has now passed the Turing Test […] is nonsense. Turing never set a 30% mark as a criterion for “passing” his test. In his famous essay on this topic, which is reprinted with commentaries in my book, Parsing the Turing Test: Methodological and Philosophical Issues in the Quest for the Thinking Computer, Turing merely conjectured that by 2000 a computer program would be able to fool an “average interrogator” into thinking it was a person 30% of the time in a five-minute conversation. He didn’t propose that as a test of anything; he was merely speculating.
Turing never actually said how his test could actually be passed, but a blue ribbon panel of computer scientists and philosophers from Harvard, MIT, and elsewhere, which I directed for several years in planning the first Loebner Prize contest in 1990, came up with a brilliant method that I am sure would have pleased Turing greatly: after lengthy conversations with both hidden humans and hidden computers, a panel ranks the humanness of each, and when the median rank of a computer exceeds the median rank of a human, it wins. No computer has ever crossed that line in the more than 20 years the contest has so far been held, but it will happen eventually.
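Epstein’s median-rank criterion can be sketched in a few lines of Python. This is only an illustration of the rule he describes, not the actual Loebner Prize scoring code; the judges’ rank values below are invented for the example.

```python
from statistics import median

def passes_median_rank_test(computer_ranks, human_ranks):
    """Sketch of Epstein's criterion: a computer 'wins' when its median
    humanness rank, as assigned by the judging panel, exceeds the median
    rank of a hidden human. Higher rank = judged more human."""
    return median(computer_ranks) > median(human_ranks)

# Hypothetical ranks from a five-judge panel (1 = least human-seeming).
computer_ranks = [2, 3, 1, 2, 2]
human_ranks = [4, 5, 4, 3, 5]

print(passes_median_rank_test(computer_ranks, human_ranks))  # False: the human still outranks the bot
```

Using the median rather than the mean keeps a single credulous (or hostile) judge from deciding the outcome on their own, which fits Epstein’s emphasis on a panel verdict over any one interrogator’s impression.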
Of course the Turing Test hasn’t been passed. I think it’s a great shame it has been reported that way, because it reduces the worth of serious AI research. We are still a very long way from achieving human-level AI, and it trivialises Turing’s thought experiment (which is fraught with problems anyway) to suggest otherwise.
And The Verge, which reported the alleged “pass,” also reported on some skeptical reactions (“Google futurist Ray Kurzweil and other experts say chatbot didn’t pass Turing Test”), including one from Ray Kurzweil:
I chatted with the chatbot Eugene Goostman, and was not impressed. Eugene does not keep track of the conversation, repeats himself word for word, and often responds with typical chatbot non sequiturs.
Others echoed his distrust of the hyped announcement:
New York University cognitive science professor Gary Marcus agrees, writing in The New Yorker that the test wasn’t taken by “innovative hardware but simply a cleverly coded piece of software.” Marcus writes that the chatbot often resorts to misdirecting the person it’s speaking with using humor so that it can avoid questions that it doesn’t understand. “It’s easy to see how an untrained judge might mistake wit for reality, but once you have an understanding of how this sort of system works, the constant misdirection and deflection becomes obvious, even irritating,” Marcus writes. “The illusion, in other words, is fleeting.”
Marc Andreessen, co-founder of Netscape and one of today’s biggest names in tech investing, isn’t taking much stock in the claims over this chatbot either. “My view is that [the] Turing Test has always been malformed,” he writes on Twitter. “Humans are too easy to trick, passing [the] test says almost nothing about software.”
Marcus gave a sample of the chatbot’s chatting in his New Yorker piece (“What Comes After the Turing Test”):
Marcus: Do you read The New Yorker?
Goostman: I read a lot of books … So many—I don’t even remember which ones.
Marcus: You remind me of Sarah Palin.
Goostman: If I’m not mistaken, Sarah is a robot, just as many other “people.” We must destroy the plans of these talking trash cans!
At least in this small sample, it doesn’t seem distinguishable from other chatbots I’ve seen.
As much as I would enjoy the drama of seeing the Turing test actually passed, a little more critical thought would have made for less hyped, more accurate reporting.
If the horizon is populated by terminators, it’s a ways off yet.