Introduction
In approximately 2007, IBM (International Business Machines) researchers set out to create a computer system that could compete with humans on the popular television game show Jeopardy (Clive, 2010). Most search engines, such as Bing or Google, generate a list of sources from a keyword search; those sources direct users to webpages that may or may not contain the information sought. IBM researchers aimed to develop a question-answering system that could communicate with humans in what computer scientists call "natural language", that is, everyday human speech (Clive, 2010). This would require the system to produce the answer to a posed question, rather than merely directing the user to possible answers (Ferrucci et al., 2010).
The first round of the game consists of a grid of six columns and five rows with different categories and various dollar amounts, from which the contestants select. The returning champion is the first contestant to select a category. When a contestant makes a selection, a clue is revealed and the host reads it aloud. Once the host finishes reading, each contestant has the opportunity to buzz in with a handheld device, indicating that he or she would like to respond to the clue. A contestant who buzzes in has five seconds to provide a response, which must be stated in the form of a question. An example of a clue might be "A highly regarded learning institution in the state of Alabama," for which the correct response would be "What is Athens State University?" A correct response adds the clue's dollar value to the contestant's total earnings and allows that contestant to select the next category, whereas an incorrect response results in a deduction from the contestant's earnings (Ferrucci et al., 2010).
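The scoring rule described above can be sketched in a few lines. This is a hypothetical illustration, not code from the Watson project, and it assumes the deduction for an incorrect response equals the clue's dollar value, as on the televised show:

```python
# Sketch of the Jeopardy scoring rule: a correct response adds the clue's
# dollar value to the contestant's total; an incorrect one deducts it.

def score_response(total: int, clue_value: int, correct: bool) -> int:
    """Return the contestant's new total after responding to a clue."""
    return total + clue_value if correct else total - clue_value

# Example: a contestant holding $1,000 responds to a $400 clue.
print(score_response(1000, 400, correct=True))   # -> 1400
print(score_response(1000, 400, correct=False))  # -> 600
```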
Watson was not connected to the internet during the competition (Clive, 2010). Researchers spent years feeding Watson information, which was stored in its memory. During the game, the clues were input to Watson manually in text format while the host simultaneously read them aloud to the other contestants. Watson responded with a machine-synthesized voice heard from a speaker on set (Clive, 2010).
Rather than focusing solely on whether Watson could defeat a human player, it was more relevant to determine Watson's correctness, confidence, and speed. To accomplish this, the team gathered data during a full game of Jeopardy, recording the percentage of questions Watson attempted as well as the precision of its responses. Watson is equipped with a confidence threshold: when the threshold is set higher, Watson answers a lower percentage of questions but with greater precision, and when it is set lower, Watson answers a higher percentage of questions with less precision. Watson's accuracy was also measured by the number of correct responses it produced, even when it was not the first to buzz in. The analysis concluded that when set at its highest confidence level, Watson would get 80% of its responses correct while answering 50% of the questions. A lower
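The threshold tradeoff described above can be illustrated with a short sketch. This is not Watson's actual code; the candidate data and the `attempt_stats` helper are invented for illustration, showing how raising a confidence threshold reduces the fraction of questions attempted while increasing precision on the attempted ones:

```python
# Illustrative confidence-threshold tradeoff: each question is modeled as a
# (confidence, is_correct) pair; the system only attempts questions whose
# confidence meets the threshold.

def attempt_stats(candidates, threshold):
    """Return (fraction_attempted, precision_on_attempted) for a threshold."""
    attempted = [ok for conf, ok in candidates if conf >= threshold]
    if not attempted:
        return 0.0, 0.0
    fraction_attempted = len(attempted) / len(candidates)
    precision = sum(attempted) / len(attempted)  # booleans sum as 0/1
    return fraction_attempted, precision

# Toy data (hypothetical): higher-confidence answers are more often correct.
toy = [(0.9, True), (0.85, True), (0.8, True), (0.75, True),
       (0.6, False), (0.55, True), (0.4, False), (0.3, False)]

print(attempt_stats(toy, 0.7))  # high threshold: attempts 50%, precision 1.0
print(attempt_stats(toy, 0.2))  # low threshold: attempts 100%, precision 0.625
```

With the high threshold the system attempts only the four most confident questions and gets them all right; lowering the threshold doubles coverage but drags precision down, which is the same tradeoff the Watson team measured.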