I’d heard of this little demonstration of computing power some time ago; last week I caught a documentary about it on PBS’s NOVA program. On the way into work today, I also heard a more up-to-date report of it on NPR’s Morning Edition. The event itself takes place tonight through Wednesday, during the regular Jeopardy! broadcast time slot.
One of my favorite moments in the NOVA program: The Watson project leader at IBM, David Ferrucci, was for a while quite discouraged with Watson’s performance in dry runs. He’d invited his two small children to the set to watch one of these earlier tests. Onstage, Watson’s screen was set up as in the above video, between two human competitors. The part of the “host” was played by a comedian. Every time Watson got a question wrong (which happened many times during that stretch), the host laughed and made a wisecrack. Because, y’know, the wrong answers were often surreally wrong. People in the audience and Watson’s competitors always laughed at the host’s commentary.
What did Ferrucci’s kids take away from the experience?
Not that they’d witnessed something important, an historic event.
Not that machines aren’t as “smart” as humans, nor even as “smart” as their advocates claim.
To my knowledge, they didn’t remember — all kid-like — something completely irrelevant like the vending machines.
No, what they got was emotional confusion:
Why was that man picking on Watson, Daddy? Why was he making fun of him? And why was everyone else laughing and applauding with the man? Didn’t that hurt Watson’s feelings?
The highway of electronics history is, as they say, littered with the road-kill of assertions that thus-and-such task will never be successfully performed by a computer. So I lay no bets on the outcome of this Jeopardy! challenge.
________________________________
* Note that this can refer to human hubris, either on the part of Watson’s designers and builders, or — to the contrary — on the part of everyone who thinks this will validate the conventional wisdom: that a computer cannot out-“think” a human on any task requiring natural-language processing.
More interesting, maybe, to wonder: Will it someday refer to machine hubris?
CS says
I can’t say that I’ll watch it but I will be interested in the outcome. In the end, it’s really a matter of programming.
John says
CS: True.
And I agree it’ll be fascinating to see it in action, even just the accounts of it. I wish only that we could see inside the machine’s “thought processes,” not just have to satisfy ourselves with the outcome. My understanding is that Watson has learned many things which it wasn’t programmed (or pre-loaded with data) to know.
(The NPR story reported one test question, “What do grasshoppers eat?” and Watson’s response, “Kosher.” It made a sort of sense, if you knew that grasshoppers are considered a kosher food… even if the nature of the “sense” was utterly mangled.)
DarcKnyt says
I know nothing of this, so it’s sort of news to me. I’ll be interested to see how it turns out.
Froog says
Or has ‘Watson’ been listening to the comedian host too much? Kosher is such a brilliant, dada-ist kind of joke!
I just worry if he (she/it…. er) gets trashed in the contest, he’ll start singing ‘Daisy, Daisy’.
John says
Darc: I’ll try to post morning-after reports here in the comments.
John says
Froog: Yes, that’s perfect — Watson’s mistakes sometimes have the character of dadaist one-liners. After watching “him” in action last night, I found myself eager for the next wrong answer; unraveling why Watson came up with Answer A instead of B became (for me) the best entertainment.
John says
Update, 2011-02-15:
Watson is competing against Brad Rutter, who’s won more money at Jeopardy! than anyone else, and Ken Jennings, who’s had the longest winning streak. Watson and Rutter are currently tied, and Jennings — who seemed to be having a problem with his “I know it!” button early on — is trailing by a couple thousand dollars.
About that button: I wondered how they would get around that. I’d thought that Watson would just send an electronic signal directly to the system which locks other competitors out, which seemed hardly fair. It turns out that they’ve rigged up for him a sort of plunger/button system which emulates the action of a thumb pushing down on a button, and part of Watson’s program activates this plunger when he decides to buzz in on an answer.
(Of course, as The Missus pointed out, it’s still not “fair”: Watson pushes his button exactly the same way every time, so that the plunger travels exactly the same distance, with exactly the same amount of force, over and over. The human competitors have to get a signal from brain to thumb, and make the thumb move without hesitation. Very hard to do repeatedly, without changing the angle of approach and so on.)
Watson’s program comes up with what he thinks are the three responses most likely to be correct, and each one is assigned a probability of being right (expressed as a percentage). If none of the candidate responses gets a probability over 50%, Watson will not attempt to buzz in. (This has provided the window of opportunity for the two human contestants to do most of their scoring.)
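For the programmers in the crowd, that buzz-in rule can be sketched in a few lines of Python. This is only my guess at the logic, based on what the broadcast showed, and certainly not IBM's actual code: the function name, the list-of-pairs layout, and the sample numbers are all made up for illustration.

```python
from typing import List, Optional, Tuple

# The 50% cutoff described above; the constant name is my own invention.
BUZZ_THRESHOLD = 0.50


def choose_response(candidates: List[Tuple[str, float]]) -> Optional[str]:
    """Return the response to buzz in with, or None to stay silent.

    Each candidate is a (response, confidence) pair, with confidence
    expressed as a fraction between 0 and 1 rather than a percentage.
    """
    if not candidates:
        return None

    # Keep the three most confident candidates, as shown on screen.
    top_three = sorted(candidates, key=lambda c: c[1], reverse=True)[:3]
    best_response, best_confidence = top_three[0]

    # Buzz in only when the best candidate is strictly over the threshold.
    if best_confidence > BUZZ_THRESHOLD:
        return best_response
    return None


# Hypothetical confidence values, not taken from the show.
confident = [("response A", 0.62), ("response B", 0.21), ("response C", 0.09)]
unsure = [("response A", 0.33), ("response B", 0.30), ("response C", 0.12)]

print(choose_response(confident))
print(choose_response(unsure))
```

With the first set of made-up numbers the sketch buzzes in with its top candidate; with the second it stays silent, which is the opening the human contestants have been exploiting.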
Interestingly, Watson seems to know nothing of the other contestants’ responses. In one case, one of the humans responded incorrectly, and Watson jumped right in… with exactly the same incorrect response. I don’t know how, or even if, they can fix this for the time being: Watson has no audio or video input system. (He receives each answer displayed on the board as a string of text, transmitted to him at the same time that Alex Trebek reads it aloud.)
As recounted at the Ars Technica blog, and elsewhere, Watson almost snuck an incorrect response past Alex:
The show was broken up by little videos in which Alex explained how Watson works, showed us the off-camera room where the system itself is housed, and so on. Thus they showed only about half an episode, through the end of the first round.
Tonight’s show continues with the “Double Jeopardy!” round. By the end of the show tomorrow night, the three contestants will supposedly have completed two full games (presumably including at least one “Final Jeopardy!” answer/question round).
marta says
I’m not going to watch when they have Mac compete against PC.
But what I wonder is what a generation learns about interactions with others, kindness, mockery, and the like with the computers added to the emotional quagmire that is life.
John says
Update, 2011-02-16:
Perhaps it’s premature to say so, but this is turning into a rout. As of the end of the program last night, the contestants’ earnings were:
The buzz-in advantage which I mentioned in yesterday’s update, above, seemed to be even more pronounced in this round.
However, I think I noticed what was going on when either of the two humans beat Watson to the buzzer, even when Watson was highly confident of the correct response. It seemed to me that they did well with shorter clues, and clues constructed with the most straightforward syntax. My theory: they’re reading ahead of Alex, and the clue’s text is transmitted to Watson only when Alex finishes reading it aloud. That imperceptible split second gives them just enough time to push the plunger.
The other attention-getting detail in last night’s round was Watson’s betting — both on the Daily Doubles and in the Final Jeopardy round. When he got the first Daily Double, with earnings at that point of $14,600, he bet $6,435 (to which Alex said, dryly, “I won’t ask…”); on the second, $1,246; and on Final Jeopardy, only $947.
On that Final Jeopardy question, Watson was an utter dud. The category was US cities; the clue, something like, “This city has two major airports, one named after a World War II hero and the other named after a World War II battle.” Both guys got it right — Chicago (the airports O’Hare and Midway, respectively). Watson apparently had a very low confidence level and followed his incorrect response with a series of question marks: Toronto???? But with such a low bet, and given that the gap between him and Rutter was then something like $25,000, it really didn’t hurt much.
During one of the documentary moments in last night’s show, the IBM team was waxing euphoric about what a profound impact Watson might have on important, complex issues of our time, like, oh, say, health care. One guy said something about how, when a doctor is uncertain of a diagnosis or suggested treatment, he or she could turn to Watson for help. The Missus and I both wondered if we’d want to be such a doctor’s patient at the moment Watson had one of those “Toronto????” hiccups.
Tonight’s show features a complete game. Whichever contestant is ahead in total earnings, including Monday’s and last night’s results, will be declared “the winner.”
John says
Update, 2011-02-17: Watson won, to no one’s very great surprise. Final tally:
However, this game was much more interesting during the regular Jeopardy round — for a time, Ken Jennings was actually ahead of Watson. (For what seemed like maybe four-five-six questions, Watson’s confidence level wasn’t high enough for him to buzz in at all.) Unfortunately, then came Double and Final Jeopardy…
The whole thing was apparently filmed a month ago. Afterwards, Jennings wrote a piece for Slate, reflecting on the experience; that piece showed up at the Slate site late last night. He reveals himself to be a man of good humor and good writing:
Other good recaps of the final night of the “contest”:
I especially liked Seabrook’s report of “the P300 response.” He quotes a 2008 contestant on the show (my emphasis added):