Watson System

44
Watson System By : Devendra Chaplot Priyank Chhipa Pratik Kumar

description

An artificially intelligent computer system capable of answering questions posed in natural language

Transcript of Watson System

Page 1: Watson System

Watson SystemBy :

Devendra Chaplot

Priyank Chhipa

Pratik Kumar

Page 2: Watson System

• ln((12,546,798 * π) ^ 2) / 34,567.46

=

What Computers Find Easier

0.00885

Page 3: Watson System

What Computers Find Easier

Owner Serial Number

David Jones 45322190-AK

Serial Number

Type Invoice #

45322190-AK LapTop INV10895

Invoice # Vendor Payment

INV10895 MyBuy $104.56

David Jones

David Jones =

Select Payment where Owner=“David Jones” and Type(Product)=“Laptop”,

Dave Jones

David Jones≠

Page 4: Watson System

Computer programs are natively explicit, fast and exacting in their calculation over numbers and symbols….But Natural Language is implicit, highly contextual, ambiguous and often imprecise.

• Where was X born?One day, from among his city views of Ulm, Otto chose a water color to

send to Albert Einstein as a remembrance of Einstein´s birthplace.

• X ran this?If leadership is an art then surely Jack Welch has proved himself a

master painter during his tenure at GE.

Person Birth Place

A.Einstein ULM

Person Organization

J. Welch GE

Structured

Unstructured

What Computers Find Hard

Page 5: Watson System

Capture the imagination– The Next Deep Blue

Engage the scientific community– Envision new ways for computers to impact society & science– Drive important and measurable scientific advances

Be Relevant to Important Problems– Enable better, faster decision making over unstructured and structured

content– Business Intelligence, Knowledge Discovery and Management,

Government, Compliance, Publishing, Legal, Healthcare, Business Integrity, Customer Relationship Management, Web Self-Service, Product Support, etc.

A Grand Challenge Opportunity

Page 6: Watson System

• Chess– A finite, mathematically well-defined search space

– Limited number of moves and states

– Grounded in explicit, unambiguous

mathematical rules

• Human Language– Ambiguous, contextual and implicit

– Grounded only in human cognition

– Seemingly infinite number of ways to express the same meaning

Real Language is Real Hard

Page 7: Watson System

A Long-Standing Challenge in Artificial Intelligence to emulate human expertise

Given– Rich Natural Language Questions– Over a Broad Domain of Knowledge

Deliver– Precise Answers: Determine what is being asked & give precise

response– Accurate Confidences: Determine likelihood answer is correct– Consumable Justifications: Explain why the answer is right– Fast Response Time: Precision & Confidence in <3 seconds

7

Automatic Open-Domain Question Answering

Page 8: Watson System

You may have heard of IBM’s Watson…

Why Jeopardy?

The game of Jeopardy! makes great demands on its players – from the range of topical knowledge covered to the nuances in language employed in the clues. The question IBM had for itself was“is it possible to build a computer system that could process big data and come up with sensible answers in seconds—so well that it could compete with human opponents?”

8

A. What is the computer system that played against human opponents on “Jeopardy”…and won.

Page 9: Watson System

• This fish was thought to be extinct millions of years ago until one was found off South Africa in 1938

• Category: ENDS IN "TH" • Answer:

• When hit by electrons, a phosphor gives off electromagnetic energy in this form

• Category: General Science• Answer:

• Secy. Chase just submitted this to me for the third time--guess what, pal. This time I'm accepting it

• Category: Lincoln Blogs• Answer:

The type of thing being asked for is

often indicated but can go from specific

to very vague

coelacanth

light (or photons)

his resignation

9

Some Basic Jeopardy! Clues

Page 10: Watson System

Lexical Answer Type

• We define a LAT to be a word in the clue that indicates the type of the answer, independent of assigning semantics to that word. For example in the following clue, the LAT is the string “maneuver.”– Category: Oooh….Chess

– Clue: Invented in the 1500s to speed up the game, this maneuver involves two pieces of the same color.

– Answer: Castling

Page 11: Watson System

Lexical Answer Type

• About 12 percent of the clues do not indicate an explicit lexical answer type but may refer to the answer with pronouns like “it,” “these,” or “this” or not refer to it at all. In these cases the type of answer must be inferred by the context.

Here’s an example:– Category: Decorating

– Clue: Though it sounds “harsh,” it’s just embroidery, often in a floral pattern, done with yarn on cotton cloth.

– Answer: crewel

Page 12: Watson System

Three types of knowledge

Domain Data

(articles, books,

documents)

Training and test question sets w/answer

keys

NLP Resources(vocabularies, taxonomies, ontologies)

Converted to Indices for search/passage lookup

Used to create logistic regression model that Watson uses for merging

scores

Named entity detection,

relationship detection

algorithms

Redirects extracted for disambiguation

Pseudo docs extracted for Candidate answer generation

Custom slot grammar parsers, prolog rules for

semantic analysis

Frame cuts generated with frequencies to determine

likely context

How we convert data into knowledge for Watson’s use

Page 13: Watson System

Machine learning• One of the core components of the system

– Multiple models

– 14000+ training questions

• Every candidate answer gets hundreds of features/scores associated with it. There features/scores are passed through previously trained ML model for candidate answer scoring

• It's not just one model. In fact there is a chain of models, each subsequent one utilizes scores produced by previously run models

• Machine learning also used in other parts of the system, such as LAT confidence analysis.

13

Page 14: Watson System

NLP• Used in many places (Question Analysis, Evidence Analysis,

Content Pre-processing)

• Combines both rule and statistic based approaches

• Full NLP stack (used in QA)– Tokenization

– Named Entity Recognition

– Deep Parsing and Predicate Argument Structure creation

– Lexical Answer Type (LAT) and Focus detection

– Anaphora resolution

– Semantic Relationships extraction

• Various technologies and techniques are used (English Slot Grammar parser, R2 NED, machine learning for LAT confidence analysis, custom annotators written in Prolog and Java)

Page 15: Watson System

NLP Examples

• LAT and Focus– It's the Peter Benchley novel about a killer

giant squid that menaces the coast of Bermuda

• Named Entity Recognition– It's the {Person::Peter Benchley} novel

about a killer giant {Animal::squid} that menaces the {Location::coast of Bermuda}

• Anaphora Resolution– Columbus embarked on his first voyage to

this continent in 1492. In the next two decades he led three more expeditions there.

Page 16: Watson System

NLP in evidence analysis and content pre-processing

• Why do NLP on evidence passages and ingested content?

• NLP in Evidence Analysis allows:– LAT based scoring

– Named entities alignment based scoring

• NLP in Content Pre-processing– Extracting and accumulating “knowledge” frames from the

content• For instance

• SVO frame cuts will contain frequencies of Subject-Verb-Object occurrences in the content that Watson has ingested.

• e.g squid menaces coast 809

– These “knowledge” frames are then used to generate candidate answers

Page 17: Watson System

he capital singer title there holiday son product site dog way form maker object heroine month countries0.00%

0.50%

1.00%

1.50%

2.00%

2.50%

3.00%

Our Focus is on reusable NLP technology for analyzing vast volumes of as-is text.

Structured sources (DBs and KBs) provide background knowledge for interpreting the text.

We do NOT attempt to anticipate all questions and build databases.

In a random sample of 20,000 questions we found2,500 distinct types*. The most frequent occurring <3% of the time. The

distribution has a very long tail.

And for each these types 1000’s of different things may be asked.

*13% are non-distinct (e.g, it, this, these or NA)

Even going for the head of the tail willbarely make a dent

We do NOT try to build a formal

model of the world

Broad Domain

Page 18: Watson System

Officials Submit Resignations (.7)People earn degrees at schools (0.9)

Inventors patent inventions (.8)

Volumes of Text Syntactic Frames Semantic Frames

Vessels Sink (0.7)People sink 8-balls (0.5) (in pool/0.8)

subject verb object

Sentence

ParsingGeneralization &

Statistical Aggregation

Fluid is a liquid (.6)Liquid is a fluid (.5)

Automatic Learning for “Reading”

Page 19: Watson System

Is(“Cytoplasm”, “liquid”) = 0.2

Is(“organelle”, “liquid”) = 0.1

In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus.

Is(“vacuole”, “liquid”) = 0.2

Is(“plasma”, “liquid”) = 0.7

“Cytoplasm is a fluid surrounding the nucleus…”

Wordnet Is_a(Fluid, Liquid) ?

Learned Is_a(Fluid, Liquid) yes.

Organelle Vacuole Cytoplasm Plasma Mitochondria Blood …

Many candidate answers (CAs) are generated from many different searches

Each possibility is evaluated according to different dimensions of evidence.

Just One piece of evidence is if the CA is of the right type. In this case a “liquid”.

Evaluating Possibilities and Their Evidences

Page 20: Watson System

celebrated

India

In May 1898

400th anniversary

arrival in

Portugal

India

In May

Garyexplorer

celebrated

anniversary

in Portugal

Keyword Matching

Keyword Matching

Keyword Matching

Keyword Matching

Keyword Matching

arrived in

In May, Gary arrived in India after he celebrated his anniversary in Portugal.

In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India.

Evidence suggests “Gary” is the answer BUT the system must learn that keyword matching may be weak relative to other types of evidence

Different Types of Evidence: Keyword Evidence

Page 21: Watson System

On 27th May 1498, Vasco da Gama landed in Kappad Beach

On 27th May 1498, Vasco da Gama landed in Kappad Beach

celebrated

May 1898 400th anniversary

arrival in

In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India.

Portugal

landed in

27th May 1498

Vasco da Gama

Temporal Reasoning

Statistical Paraphrasing

GeoSpatial Reasoning

explorer

On 27th May 1498, Vasco da Gama landed in Kappad Beach

On the 27th of May 1498, Vasco da Gama landed in Kappad Beach

Kappad Beach

Para-phrases

Geo-KB

DateMath

India

Stronger evidence can be much harder to find and score.

The evidence is still not 100% certain.

Search Far and Wide

Explore many hypotheses

Find Judge Evidence

Many inference algorithms

Different Types of Evidence: Deeper Evidence

Page 22: Watson System

DeepQAThe technology & architecture behind Watson

Page 23: Watson System

The Difference Between Search & DeepQA

Decision Maker

Search EngineFinds Documents containing

Keywords

Delivers Documents based on Popularity

Has Question

Distills to 2-3 Keywords

Reads Documents, Finds Answers

Finds & Analyzes Evidence

ExpertUnderstands Question

Produces Possible Answers & Evidence

Delivers Response, Evidence & Confidence

Analyzes Evidence, Computes Confidence

Asks NL Question

Considers Answer & Evidence

Decision Maker

Page 24: Watson System

DeepQA: the technology & architecture behind Watson

InitialQuestion

HypothesisGeneration

Hypothesis & Evidence

Scoring

Final Confidence Merging & RankingSynthesis

Question & Topic Analysis

HypothesisGeneration

Hypothesis and Evidence Scoring

Learned Modelshelp combine and

weigh the Evidence

Evidence Sources

Answer Scoring

Deep Evidence Scoring

EvidenceRetrieval

Answer Sources

PrimarySearch

CandidateAnswer

Generation

QuestionDecomposition

HypothesisGeneration

Hypothesis and Evidence Scoring

model

model

model

model

model

model

model

model

model

Answer &Confidence

Page 25: Watson System

InitialQuestion

Question & Topic Analysis

QuestionDecomposition

Initial Question Formulated: “The name of this monetary

unit comes from the word for "round"; earlier coins were

often oval”

1

It decides whether the question needs to be subdivided.

3

Watson performs question analysis, determines what is

being asked.2

DeepQA: the technology & architecture behind Watson

Page 26: Watson System

InitialQuestio

n

Hypothesis

Generation

Question &

Topic Analysis

Hypothesis

Generation

QuestionDecompositio

n

Answer Sources

PrimarySearch

CandidateAnswer

Generation

Hypothesis

Generation

5In creating the

hypotheses it will use, Watson consults

numerous sources for potential answers…

Watson then starts to generate

hypotheses based on decomposition

and initial analysis…as many hypothesis as may be relevant

to the initial question…

4

DeepQA: the technology & architecture behind Watson

Page 27: Watson System

DeepQA: the technology & architecture behind Watson

InitialQuestio

n

Hypothesis

Generation

Hypothesis &

Evidence Scoring

Synthesis

Question &

Topic Analysis

Hypothesis and Evidence Scoring

Answer Sources

PrimarySearch

CandidateAnswer

Generation

QuestionDecompositio

n

Hypothesis and Evidence Scoring

Evidence Sources

Answer Scoring

Deep Evidence Scoring

EvidenceRetrieval

7Watson uses

Evidence Sources to validate it’s

hypothesis and help score the

potential answers

If the question was decomposed,

Watson brings

together hypotheses from sub-

parts

8

Watson then uses algorithms to “score” each

potential answer and assign a confidence

to that answer…

6

Page 28: Watson System

InitialQuestio

n

DeepQA: the technology & architecture behind Watson

Hypothesis & Evidence

Scoring

Final Confidence Merging & RankingSynthesis

HypothesisGeneration

HypothesisGeneration

Learned Modelshelp combine and

weigh the Evidence

model

model

model

model

model

model

model

model

model

Answer &Confidence

HypothesisGeneration

Question & Topic Analysis

Answer Sources

PrimarySearch

CandidateAnswer

Generation

QuestionDecomposition

InitialQuestion

Using models on the merged hypotheses, Watson can

weigh evidence based on prior “experiences”

9

Once Watson has ranked its answers, it

then provides its answers as well as the

confidence it has in each answer.

10

Page 29: Watson System

InitialQuestio

n

DeepQA: the technology & architecture behind Watson

InitialQuestion

HypothesisGeneration

Hypothesis & Evidence

Scoring

Final Confidence Merging & RankingSynthesis

Question & Topic Analysis

HypothesisGeneration

Hypothesis and Evidence Scoring

Learned Modelshelp combine and

weigh the Evidence

Evidence Sources

Answer Scoring

Deep Evidence Scoring

EvidenceRetrieval

Answer Sources

PrimarySearch

CandidateAnswer

Generation

QuestionDecomposition

HypothesisGeneration

Hypothesis and Evidence Scoring

model

model

model

model

model

model

model

model

model

Answer &Confidence

Page 30: Watson System

Step 0 : Content Acquisition

• Content acquisition is a combination of manual and automatic steps.

• The first step is to analyze example questions from the problem space to produce a description of the kinds of questions that must be answered and a characterization of the application domain.

• Analyzing example questions is primarily a manual task, while domain analysis may be informed by automatic or statistical analyses, such as the LAT analysis.

Page 31: Watson System

Step 1 : Question Analysis

The system attempts to understandwhat the question is asking and performs the initial analyses that determine how the question will be processed by the rest of the system.

• Question Classification e.g. puzzle/math

• Focus and Lexical Answer Type (LAT) e.g. “On this day” LAT – date/day

• Relation Detection e.g. sea (India, x, west)

• Decomposition - divide and conquer.

InitialQuestio

n

InitialQuestion

Question & Topic Analysis

QuestionDecomposition

Page 32: Watson System

Step 2 : Hypothesis Generation

1. Primary search :– Keyword based search

– Top 250 results are considered for CandidateAnswer generation.

– Empirical statistics : 85% time answer is withintop 250 results.

2. CA generation : generates CAs using results ofPrimary Search

3. Soft Filtering– lightweight (less resource intensive) scoring algorithms to a larger

set of initial candidates to prune them down to a smaller set of candidates

– Reduction in number of CA to approx. 100

– Answers are not fully discarded , may be reconsidered at final stage.

HypothesisGeneration

HypothesisGeneration

Answer Sources

PrimarySearch

CandidateAnswer

Generation

HypothesisGeneration

QuestionDecomposition

Page 33: Watson System

Step 2 : Hypothesis Generation

4. Each CA plugged back into the question is considered a hypothesis which the system has to prove correct with some threshold of confidence.

5. If failed at this state , system has no hope of answering the question whatsoever. – Noise tolerance - tolerate noise in the early stages

of the pipeline and drive up precision downstream

– Favors recall over precision, with the expectation that the rest of the processing pipeline will tease out the correct answer, even if the set of candidates is quite large

Page 34: Watson System

Step 3 : Hypothesis & Evidence scoring• Candidate answers that pass the soft filtering

threshold undergo a rigorous evaluation process that involves 2 steps :-

1. Evidence retrieval :– Gathers additional supporting evidence for each

candidate answer, or hypothesis.

e.g. Passage search: gathering passages by adding CA to primary search query.

2. Scoring:– Deep content analysis – includes many different

components, or scorers, that consider different dimensions of the evidence

– Produce a score that corresponds to how well evidence supports a candidate answer for a given question.

Page 35: Watson System

Step 4 : Final Merging and Ranking

• Merging:

– Multiple candidate answers for a question may be equivalent despite very different surface forms.

– Using an ensemble of matching, normalization and co-reference resolution algorithms, Watson identifies equivalent and related hypothesis.

– Without merging, ranking algorithms would be comparing multiple surface forms that represent the same answer and trying to discriminate among them.

Page 36: Watson System

Step 4 : Final Merging and Ranking

• Ranking and confidence estimation:– After merging, the system must rank the

hypotheses and estimate confidence based on their merged scores

– These hypothese are ran over set of training questions with known answers.

– Watson’s metalearner uses multiple trained models to handle different question classes as, for instance, certain scores that may be crucial to iden- tifying the correct answer for a factoid question may not be as useful on puzzle questions

Page 37: Watson System

38

The Final Blow! (ctd.)

“I for one welcome our new computer overlords” - Jennings

Page 38: Watson System

Watson – a Workload Optimized System• 90 x IBM Power 7501 servers

• 2880 POWER7 cores

• POWER7 3.55 GHz chip

• 500 GB per sec on-chip bandwidth

• 10 Gb Ethernet network

• 15 Terabytes of memory

• 20 Terabytes of disk, clustered

• Can operate at 80 Teraflops

• Runs IBM DeepQA software

• Scales out with and searches vast amounts of unstructured information with UIMA & Hadoop open source components

• Linux provides a scalable, open platform, optimized to exploit POWER7 performance

• 10 racks include servers, networking, shared disk system, cluster controllers1 Note that the Power 750 featuring POWER7 is a commercially available

server that runs AIX, IBM i and Linux and has been in market since Feb 2010

Page 39: Watson System

Watson: Precision, Confidence & Speed• Deep Analytics – We achieved champion-levels of

Precision and Confidence over a huge variety of expression

• Speed – By optimizing Watson’s computation for Jeopardy! on 2,880 POWER7 processing cores we went from 2 hours per question on a single CPU to an average of just 3 seconds – fast enough to compete with the best.

• Results – in 55 real-time sparring against former Tournament of Champion Players last year, Watson put on a very competitive performance, winning 71%. In the final Exhibition Match against Ken Jennings and Brad Rutter, Watson won!

Page 40: Watson System

Potential Business Applications

41

Automotive Quality Insight• Analyzing: Tech notes, call logs, online media• For: Warranty Analysis, Quality Assurance• Benefits: Reduce warranty costs, improve

customer satisfaction, marketing campaigns

Crime Analytics• Analyzing: Case files, police records, 911 calls…• For: Rapid crime solving & crime trend analysis• Benefits: Safer communities & optimized force

deployment

Healthcare Analytics• Analyzing: E-Medical records, hospital reports• For: Clinical analysis; treatment protocol

optimization• Benefits: Better management of chronic

diseases; optimized drug formularies; improved patient outcomes

Insurance Fraud• Analyzing: Insurance claims• For: Detecting Fraudulent activity & patterns• Benefits: Reduced losses, faster detection,

more efficient claims processes

Customer Care• Analyzing: Call center logs, emails, online media• For: Buyer Behavior, Churn prediction• Benefits: Improve Customer satisfaction and

retention, marketing campaigns, find new revenue opportunities

Social Media for Marketing• Analyzing: Call center notes, SharePoint,

multiple content repositories• For: churn prediction, product/brand quality • Benefits: Improve consumer satisfaction,

marketing campaigns, find new revenue opportunities or product/brand quality issues

Page 41: Watson System

References• The AI magazine

– Ferrucci, David, et al. "Building Watson: An overview of the DeepQA project." AI magazine 31.3 (2010): 59-79.

• Watson Systems: – http://www-03.ibm.com/innovation/us/watson/

• Wiki Page– http://en.wikipedia.org/wiki/Watson_%28computer%2

• Building Watson A Brief Overview of the DeepQA Project by Joel Farrell, IBM– http://www.medbiq.org/sites/default/files/presentation

s/2011/Farrell.ppt

Page 42: Watson System

References• What is Watson, really?

– http://www-01.ibm.com/software/ebusiness/jstart/downloads/IOD2011.ppt

– Authors : Keyur Dalal (IBM), Vladimir Stemkovski (IBM) and Jeff Sumner (IBM)

• Jeopardy! IBM Watson Day 1 (Feb 14, 2011)– http://

www.youtube.com/watch?v=seNkjYyG3gI&feature=related

• Science Behind an Answer-– http://

www-03.ibm.com/innovation/us/watson/what-is-watson/science-behind-an-answer.html

– Video:  http://youtu.be/DywO4zksfXw

Page 43: Watson System

Questions?

Page 44: Watson System

Thank You!