Department of Computer Science The University of Texas at Austin

Probabilistic Abduction using Markov Logic Networks

Rohit J. Kate Raymond J. Mooney

2

Abduction

• Abduction is inference to the best explanation for a given set of evidence

• Applications include tasks in which observations need to be explained by the best hypothesis

– Plan recognition

– Intent recognition

– Medical diagnosis

– Fault diagnosis

…

• Most previous work falls under two frameworks for abduction

– First-order logic based abduction

– Probabilistic abduction using Bayesian networks

3

Logical Abduction

Given:

• Background knowledge, B, in the form of a set of (Horn) clauses in first-order logic

• Observations, O, in the form of atomic facts in first-order logic

Find:

• A hypothesis, H, a set of assumptions (logical formulae) that logically entail the observations given the theory: B ∪ H ⊨ O

• Typically, the best explanation is the one with the fewest assumptions, e.g. the one that minimizes |H|
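The search for a smallest hypothesis can be sketched for the ground (propositional) case. This is a brute-force illustration, not one of the cited systems: the function names, the set of assumable literals, and the choice to treat one observation as a known fact while explaining the other are all assumptions of this sketch. The rules are hand-grounded versions of the malaria clauses used in the sample problem.

```python
from itertools import combinations

# Ground Horn rules as (body atoms, head atom); hand-grounded for illustration.
RULES = [
    (("Mosquito(M)", "Infected(M,Malaria)", "Bite(M,John)"), "Infected(John,Malaria)"),
    (("Infected(Mary,Malaria)", "Transfuse(Blood,Mary,John)"), "Infected(John,Malaria)"),
]

def forward_chain(facts, rules):
    """Return the deductive closure of the facts under the Horn rules."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in closure and all(atom in closure for atom in body):
                closure.add(head)
                changed = True
    return closure

def minimal_explanations(rules, known_facts, goal, assumables):
    """Return every smallest set H of assumables such that
    rules + known_facts + H entail the goal (minimizes |H|)."""
    candidates = sorted(assumables)
    for size in range(len(candidates) + 1):
        found = [set(h) for h in combinations(candidates, size)
                 if goal in forward_chain(set(known_facts) | set(h), rules)]
        if found:
            return found
    return []

# Assumption of this sketch: the transfusion is taken as given, the infection is explained.
KNOWN = {"Transfuse(Blood,Mary,John)"}
GOAL = "Infected(John,Malaria)"
ASSUMABLES = {"Infected(Mary,Malaria)", "Mosquito(M)",
              "Infected(M,Malaria)", "Bite(M,John)"}

print(minimal_explanations(RULES, KNOWN, GOAL, ASSUMABLES))
# [{'Infected(Mary,Malaria)'}]
```

The mosquito explanation needs three assumptions, so the single assumption Infected(Mary,Malaria) wins under the minimize-|H| criterion.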

4

Sample Logical Abduction Problem

• Background knowledge:

∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))

∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))

• Observations:

Infected(John, Malaria)

Transfuse(Blood, Mary, John)

• Explanation:

Infected(Mary, Malaria)

5

Previous Work in Logical Abduction

• Several first-order logic based approaches [Poole et al. 1987; Stickel 1988; Ng & Mooney 1991; Kakas et al. 1993]

• Perform first-order “backward” logical reasoning to determine the set of assumptions sufficient to deduce observations

• Unable to reason under uncertainty to find the most probable explanation

Background knowledge:

∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria)) (holds 80% of the time)

∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria)) (holds 40% of the time)

Observations:

Infected(John, Malaria) (99% sure)

Transfuse(Blood, Mary, John) (60% sure)

6

Previous Work in Probabilistic Abduction

• An alternate framework is based on Bayesian networks [Pearl 1988]

• Uncertainties are encoded in a directed graph

• Given a set of observations, probabilistic inference over the graph computes the posterior probabilities of explanations

• Unable to handle structured representations because they are essentially based on propositional logic

7

Probabilistic Abduction using MLNs

• We present a new approach for probabilistic abduction that combines first-order logic and probabilistic graphical models

• Uses Markov Logic Networks (MLNs) [Richardson and Domingos 2006], a theoretically sound framework for combining first-order logic and probabilistic graphical models

Rest of the talk:

– MLNs

– Our approach using MLNs

– Experiments

– Future Work and Conclusions

8

Markov Logic Networks (MLNs) [Richardson and Domingos 2006]

• A logical knowledge base is a set of hard constraints on the set of possible worlds

• An MLN is a set of soft constraints: when a world violates a clause, it becomes less probable, not impossible

• Give each clause a weight (higher weight ⇒ stronger constraint)

$P(\text{world}) \propto \exp\left(\sum \text{weights of clauses it satisfies}\right)$

9

Sample MLN Clauses

∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria)) (weight 20)

∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria)) (weight 5)

10

MLN Probabilistic Model

• An MLN is a template for constructing a Markov network

– Ground literals correspond to nodes

– Ground clauses correspond to cliques connecting the ground literals in the clause

• Probability of a world (truth assignment) x:

$P(x) = \frac{1}{Z} \exp\left(\sum_i w_i \, n_i(x)\right)$

where w_i is the weight of clause i, n_i(x) is the number of true groundings of clause i in x, and Z is the normalization constant

11

Sample MLN Probabilistic Model

• Clauses with weights:

∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria)) (weight 20)

∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria)) (weight 5)

• Constants: John, Mary, M

• Ground literals:

Mosquito(M), Infected(M,Malaria), Bite(M,John), Bite(M,Mary), Infected(John,Malaria), Infected(Mary,Malaria), Transfuse(Blood,John,Mary), Transfuse(Blood,Mary,John)

12


• A possible world (truth assignment to the ground literals):

Mosquito(M) = true

Infected(M,Malaria) = true

Bite(M,John) = false

Bite(M,Mary) = false

Infected(John,Malaria) = true

Infected(Mary,Malaria) = true

Transfuse(Blood,John,Mary) = true

Transfuse(Blood,Mary,John) = false

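• Probability of the world above, in which each clause has 2 true groundings:

$P(\text{world}) = \frac{1}{Z} \exp(20 \cdot 2 + 5 \cdot 2)$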
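The exponent 20·2 + 5·2 can be reproduced with a small enumeration. This is a minimal sketch under stated assumptions, not Alchemy code: the groundings are typed by hand (x ranges over the mosquito M in the first clause, and over distinct people in the second) so that they match the eight ground literals listed on the slides, and all function and variable names are made up for the illustration.

```python
from itertools import product
from math import exp

PEOPLE = ["John", "Mary"]
MOSQUITOES = ["M"]

# Assumed typed groundings of the two weighted clauses: (weight, body atoms, head atom).
CLAUSES = []
for m in MOSQUITOES:
    for p in PEOPLE:
        CLAUSES.append((20.0,
                        [f"Mosquito({m})", f"Infected({m},Malaria)", f"Bite({m},{p})"],
                        f"Infected({p},Malaria)"))
for x in PEOPLE:
    for y in PEOPLE:
        if x != y:
            CLAUSES.append((5.0,
                            [f"Infected({x},Malaria)", f"Transfuse(Blood,{x},{y})"],
                            f"Infected({y},Malaria)"))

# The truth assignment from the sample slide.
WORLD = {
    "Mosquito(M)": True, "Infected(M,Malaria)": True,
    "Bite(M,John)": False, "Bite(M,Mary)": False,
    "Infected(John,Malaria)": True, "Infected(Mary,Malaria)": True,
    "Transfuse(Blood,John,Mary)": True, "Transfuse(Blood,Mary,John)": False,
}
ATOMS = list(WORLD)

def log_weight(world, clauses):
    """Sum of w_i * n_i(x): add a clause's weight once per true grounding."""
    return sum(w for w, body, head in clauses
               if not all(world[a] for a in body) or world[head])

def probability(world, clauses, atoms):
    """P(x) = exp(sum_i w_i n_i(x)) / Z, with Z summed over all 2^8 worlds."""
    z = sum(exp(log_weight(dict(zip(atoms, values)), clauses))
            for values in product([False, True], repeat=len(atoms)))
    return exp(log_weight(world, clauses)) / z

print(log_weight(WORLD, CLAUSES))  # 50.0, i.e. 20*2 + 5*2
print(probability(WORLD, CLAUSES, ATOMS))
```

Exhaustive enumeration of Z is only feasible for toy models like this one; it is used here purely to make the formula concrete.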

18

MLNs Inference and Learning

• Using probabilistic inference techniques, one can determine the most probable truth assignment, the probability that a clause holds, etc.

• Given a database of training examples, appropriate weights of the formulae can be learned to maximize the probability of the training data

• An open-source software package for MLNs called Alchemy is available

19

Abduction using MLNs

• Given:

Infected(Mary,Malaria) ∧ Transfuse(Blood,Mary,John) → Infected(John,Malaria)

Transfuse(Blood, Mary, John)

Infected(John, Malaria)

• The clause is satisfied whether Infected(Mary, Malaria) is true or false

• Given the observations, a world has the same probability in the MLN whether the explanation is true or false, so explanations cannot be inferred

• The MLN inference mechanism is inherently deductive and not abductive

20

Adapting MLNs for Abduction

• Explicitly include the reverse implications

∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))

∀y (Infected(y,Malaria) → ∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria)))

• Existentially quantify the universally quantified variables that appear on the LHS but not on the RHS of the original clause

• Now, given Transfuse(Blood, Mary, John) and Infected(John, Malaria), the probability of the world in which Infected(Mary,Malaria) is true will be higher
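The effect of the reverse implication can be checked numerically on the two worlds that agree with the observations. This is a minimal sketch under stated assumptions: only the three relevant ground literals are modelled, the existential quantifier is grounded over the single candidate Mary, the forward weight 5 is taken from the sample clauses, and the reverse weight 2.0 is a hypothetical value chosen only for illustration.

```python
from math import exp

def implication_holds(body, head):
    """Truth value of the ground implication body → head."""
    return (not body) or head

def world_log_weight(mary_infected, w_forward=5.0, w_reverse=None):
    """Sum of weights of satisfied ground clauses, with both observations fixed to true."""
    transfuse, john_infected = True, True  # the observations
    total = 0.0
    # Forward clause: Infected(Mary) ∧ Transfuse(Mary,John) → Infected(John)
    if implication_holds(mary_infected and transfuse, john_infected):
        total += w_forward
    # Reverse clause: Infected(John) → Transfuse(Mary,John) ∧ Infected(Mary)
    if w_reverse is not None and implication_holds(john_infected,
                                                   transfuse and mary_infected):
        total += w_reverse
    return total

def posterior_mary_infected(w_reverse=None):
    """P(Infected(Mary) | observations), normalizing over the two consistent worlds."""
    on = exp(world_log_weight(True, w_reverse=w_reverse))
    off = exp(world_log_weight(False, w_reverse=w_reverse))
    return on / (on + off)

print(posterior_mary_infected())               # 0.5
print(posterior_mary_infected(w_reverse=2.0))  # greater than 0.5
```

With the forward clause alone both worlds get the same weight, so the posterior is exactly 0.5. Once the reverse clause is added, only the world in which Infected(Mary,Malaria) is true satisfies it, and the posterior becomes e^w / (e^w + 1) for reverse weight w.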

21


• However, there could be multiple explanations for the same observations:

∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))

∀y (Infected(y,Malaria) → ∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria)))

∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))

∀y (Infected(y,Malaria) → ∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y)))

• An observation should be explained by one explanation and not multiple explanations

• The system should support “explaining away” [Pearl 1988]

22

∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))

∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))

• Add the disjunction clause and the mutual exclusivity clause for the same RHS term

∀y (Infected(y,Malaria) → (∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria))) ∨ (∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y))))

∀y (Infected(y,Malaria) → ¬(∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria))) ∨ ¬(∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y))))

• Since MLN clauses are soft constraints, both explanations can still be true

23


• In general, for the Horn clauses P1 → Q, P2 → Q, …, Pn → Q in the background knowledge base, add:

– A reverse implication disjunction clause

Q → P1 ∨ P2 ∨ … ∨ Pn

– A mutual exclusivity clause for every pair of explanations

Q → ¬P1 ∨ ¬P2

Q → ¬P1 ∨ ¬Pn

…

Q → ¬P2 ∨ ¬Pn

• Weights can be learned from training examples or can be set heuristically
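For the propositional case, the construction above can be sketched in a few lines. This is an illustration under stated assumptions rather than the formal algorithm: rule bodies are treated as opaque strings, no unification or variable renaming is attempted, no weights are attached, and the function name is hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def abductive_clauses(horn_rules):
    """Given propositional Horn rules as (body, head) strings, return the
    reverse implication disjunction clause and the pairwise mutual
    exclusivity clauses for each head."""
    bodies = defaultdict(list)
    for body, head in horn_rules:
        bodies[head].append(body)

    clauses = []
    for head, explanations in bodies.items():
        # Reverse implication: Q → P1 ∨ P2 ∨ … ∨ Pn
        clauses.append(f"{head} → " + " ∨ ".join(explanations))
        # Mutual exclusivity: Q → ¬Pi ∨ ¬Pj for every pair i < j
        for p, q in combinations(explanations, 2):
            clauses.append(f"{head} → ¬{p} ∨ ¬{q}")
    return clauses

for clause in abductive_clauses([("P1", "Q"), ("P2", "Q"), ("P3", "Q")]):
    print(clause)
# Q → P1 ∨ P2 ∨ P3
# Q → ¬P1 ∨ ¬P2
# Q → ¬P1 ∨ ¬P3
# Q → ¬P2 ∨ ¬P3
```

For n explanations of the same head this yields one disjunction clause and n(n−1)/2 mutual exclusivity clauses.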

24


• There could be constants or variables on the RHS predicate

∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))

∀x (Infected(x,Malaria) ∧ Transfuse(Blood,x,John) → Infected(John,Malaria))

25




Infected(John,Malaria) → (∃x (Transfuse(Blood,x,John) ∧ Infected(x,Malaria))) ∨ (∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,John)))

Infected(John,Malaria) → ¬(∃x (Transfuse(Blood,x,John) ∧ Infected(x,Malaria))) ∨ ¬(∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,John)))

26






27






• The formal algorithm is described in the paper; it requires appropriate unifications and variable renamings

28

Experiments: Dataset

• Plan recognition dataset used to evaluate abductive systems [Ng & Mooney 1991; Charniak & Goldman 1991]

• Characters’ higher-level plans must be inferred to explain their observed actions in a narrative text

– “Fred went to the supermarket. He pointed a gun at the owner. He packed his bag.” => robbing

– “Jack went to the supermarket. He found some milk on the shelf. He paid for it.” => shopping

• Dataset contains 25 development [Goldman 1990] and 25 test examples [Ng & Mooney 1992]

29

Experiments: Dataset contd.

• A background knowledge base of 107 rules was constructed for the ACCEL system [Ng and Mooney 1991] to work with the 25 development examples, for example:

instance_shopping(s) ∧ go_step(s,g) → instance_going(g)

instance_shopping(s) ∧ go_step(s,g) ∧ shopper(s,p) → goer(g,p)

• Narrative text is represented in first-order logic, with an average of 12.6 literals

– “Bill went to the store. He paid for some milk.”

instance_going(Go1) goer(Go1,Bill) destination_go(Store1) instance_paying(Pay1) payer(Pay1,Bill) thing_paid(Pay1,Milk1)

• Assumptions explaining the above actions:

instance_shopping(S1) shopper(S1,Bill) go_step(S1,Go1) pay_step(S1,Pay1) thing_shopped_for(S1,Milk)

30

Experiments: Methodology

• Our algorithm automatically adds clauses to the knowledge-base for performing abduction using MLNs

• We found that 25 development examples were too few to learn weights for MLNs, so we set the weights heuristically

– Small negative weights on unit clauses so that they are not assumed for no reason

– Medium weights on reverse implication clauses

– Large weights on mutual exclusivity clauses

• Given a set of observations, we use Alchemy’s probabilistic inference to determine the most likely truth assignment for the remaining literals

31

Experiments: Methodology contd.

• We compare with the ACCEL system [Ng & Mooney 1992], a purely logic-based system for abduction

• Selects the best explanation using a metric

– Simplicity metric: selects the explanation of smallest size

– Coherence metric: selects the explanation that maximally connects the observations (specifically geared towards this task)

• “John took the bus. He bought milk.” => John took the bus to the store where he bought the milk.

32

Experiments: Methodology contd.

• Besides finding the assumptions, a deductive system like MLN also finds other facts that can be deduced from the assumptions

• We deductively expand ACCEL’s output and gold-standard answers for a fair comparison

• We measure

– Precision: what fraction of the predicted ground literals are in the gold-standard answers

– Recall: what fraction of the ground literals in the gold-standard answers were predicted

– F-measure: harmonic mean of precision and recall
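The three measures can be sketched over sets of ground literals as follows. This is an illustrative helper, not the evaluation script used in the experiments: the function name, the convention of returning 0.0 for empty sets, and the example literal instance_robbing(R1) are assumptions of this sketch.

```python
def precision_recall_f(predicted, gold):
    """Precision, recall, and F-measure of predicted ground literals
    against the gold-standard ground literals."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    # Assumed convention: a measure with an empty denominator is 0.0.
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f_measure

gold = {"instance_shopping(S1)", "shopper(S1,Bill)",
        "go_step(S1,Go1)", "pay_step(S1,Pay1)"}
predicted = {"instance_shopping(S1)", "shopper(S1,Bill)",
             "go_step(S1,Go1)", "instance_robbing(R1)"}  # last literal is hypothetical

print(precision_recall_f(predicted, gold))  # (0.75, 0.75, 0.75)
```

Three of the four predicted literals are in the gold standard and three of the four gold literals are predicted, so all three measures are 0.75 in this example.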

33

Experiments: Results

[Figure: bar chart of F-measure, recall, and precision (axis from 70 to 100) for MLN, ACCEL-Simplicity, and ACCEL-Coherence on the development set]

34

Experiments: Results contd.

[Figure: bar chart of F-measure, recall, and precision (axis from 70 to 100) for MLN, ACCEL-Simplicity, and ACCEL-Coherence on the test set]

35

Experiments: Results contd.

• MLN performs better than ACCEL-simplicity, particularly on the development set

• ACCEL-coherence performs the best, but was specifically tailored for the narrative understanding task

• The dataset used does not require full probabilistic treatment because there is little uncertainty in the knowledge base or observations

• MLNs did not need any heuristic metric but simply found the most probable explanation

36

Future Work

• Evaluate probabilistic abduction using MLNs on a task in which uncertainty plays a bigger role

• Evaluate on a larger dataset on which the weights could be learned to automatically adapt to a particular domain

– Previous abductive systems like ACCEL have no learning mechanism

• Perform probabilistic abduction using other frameworks for combining first-order logic and graphical models [Getoor & Taskar 2007], for example Bayesian Logic Programming [Kersting & De Raedt 2001], and compare with the presented approach

37

Conclusions

• A general method for probabilistic first-order logical abduction using MLNs

• Existing off-the-shelf deductive inference system of MLNs is employed to do abduction by suitably reversing the implications

• Handles uncertainties using probabilities and an unbounded number of related entities using first-order logic, and is capable of learning

• Experiments on a small plan recognition dataset demonstrated that it compares favorably with special-purpose logic-based abductive systems

38

Thanks!

Questions?
