Department of Computer Science The University of Texas at Austin
Probabilistic Abduction using Markov Logic Networks
Rohit J. Kate Raymond J. Mooney
2
Abduction
• Abduction is inference to the best explanation for a given set of evidence
• Applications include tasks in which observations need to be explained by the best hypothesis
– Plan recognition
– Intent recognition
– Medical diagnosis
– Fault diagnosis
…
• Most previous work falls under two frameworks for abduction
– First-order logic based abduction
– Probabilistic abduction using Bayesian networks
3
Logical Abduction
Given:
• Background knowledge, B, in the form of a set of (Horn) clauses in first-order logic
• Observations, O, in the form of atomic facts in first-order logic
Find:
• A hypothesis, H, a set of assumptions (logical formulae) that logically entail the observations given the theory: B ∪ H ⊨ O
• Typically, the best explanation is the one with the fewest assumptions, e.g. the one that minimizes |H|
4
Sample Logical Abduction Problem
• Background knowledge:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))
• Observations:
Infected(John, Malaria)
Transfuse(Blood, Mary, John)
• Explanation:
Infected(Mary, Malaria)
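The sample problem can be reproduced with a small brute-force search. This is only an illustrative sketch, not the backward-chaining procedure used by the cited systems: the rules are ground by hand, observations that no rule can conclude are treated as given facts (an assumption of this sketch), and the smallest assumption sets are found by trying subsets in order of size.

```python
from itertools import combinations

# Hand-ground instances of the two background rules that can conclude
# Infected(John,Malaria). M is the mosquito constant used later in the talk.
RULES = [
    ({"Mosquito(M)", "Infected(M,Malaria)", "Bite(M,John)"}, "Infected(John,Malaria)"),
    ({"Infected(Mary,Malaria)", "Transfuse(Blood,Mary,John)"}, "Infected(John,Malaria)"),
]
OBSERVATIONS = {"Infected(John,Malaria)", "Transfuse(Blood,Mary,John)"}


def closure(facts):
    """Forward-chain the ground Horn rules until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in RULES:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived


def abduce(observations):
    """Return all smallest assumption sets H that let the rules derive the observations."""
    heads = {head for _, head in RULES}
    # Assumption of this sketch: observations that no rule concludes are taken as given.
    given = {o for o in observations if o not in heads}
    # Only rule-body atoms may be assumed, so an observation cannot explain itself.
    candidates = sorted(set().union(*(body for body, _ in RULES)) - given)
    for size in range(len(candidates) + 1):
        found = [set(h) for h in combinations(candidates, size)
                 if observations <= closure(given | set(h))]
        if found:
            return found  # minimal |H| reached
    return []


explanations = abduce(OBSERVATIONS)
print(explanations)
```

With the observations above, the only minimal explanation found is the single assumption Infected(Mary,Malaria), matching the slide.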
5
Previous Work in Logical Abduction
• Several first-order logic based approaches [Poole et al. 1987; Stickel 1988; Ng & Mooney 1991; Kakas et al. 1993]
• Perform first-order “backward” logical reasoning to determine the set of assumptions sufficient to deduce observations
• Unable to reason under uncertainty to find the most probable explanation
Background knowledge:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria)) holds 80% of the time
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria)) holds 40% of the time
Observations:
Infected(John, Malaria): 99% sure
Transfuse(Blood, Mary, John): 60% sure
6
Previous Work in Probabilistic Abduction
• An alternate framework is based on Bayesian networks [Pearl 1988]
• Uncertainties are encoded in a directed graph
• Given a set of observations, probabilistic inference over the graph computes the posterior probabilities of explanations
• Unable to handle structured representations because it is essentially based on propositional logic
7
Probabilistic Abduction using MLNs
• We present a new approach for probabilistic abduction that combines first-order logic and probabilistic graphical models
• Uses Markov Logic Networks (MLNs) [Richardson and Domingos 2006], a theoretically sound framework for combining first-order logic and probabilistic graphical models
Rest of the talk:
– MLNs
– Our approach using MLNs
– Experiments
– Future Work and Conclusions
8
Markov Logic Networks (MLNs) [Richardson and Domingos 2006]
• A logical knowledge base is a set of hard constraints on the set of possible worlds
• An MLN is a set of soft constraints: when a world violates a clause, it becomes less probable, not impossible
• Give each clause a weight (higher weight ⇒ stronger constraint)
$P(\text{world}) \propto \exp\left(\sum \text{weights of clauses it satisfies}\right)$
9
Sample MLN Clauses
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))   weight 20
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))   weight 5
10
MLN Probabilistic Model
• An MLN is a template for constructing a Markov network
– Ground literals correspond to nodes
– Ground clauses correspond to cliques connecting the ground literals in the clause
• Probability of a world (truth assignment) x:
$P(x) = \frac{1}{Z} \exp\left(\sum_i w_i \, n_i(x)\right)$
where w_i is the weight of clause i, n_i(x) is the number of true groundings of clause i in x, and Z is the normalizing constant
11
Sample MLN Probabilistic Model
• Clauses with weights:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))   weight 20
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))   weight 5
• Constants: John, Mary, M
• Ground literals with a sample truth assignment:
Mosquito(M) true
Infected(M,Malaria) true
Bite(M,John) false
Bite(M,Mary) false
Infected(John,Malaria) true
Infected(Mary,Malaria) true
Transfuse(Blood,John,Mary) true
Transfuse(Blood,Mary,John) false
• Probability of this world (each clause has 2 true groundings):
$P(x) = \frac{1}{Z} \exp(20 \cdot 2 + 5 \cdot 2)$
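The exponent 20·2 + 5·2 can be checked by counting true groundings directly. The sketch below assumes typed groundings that match the eight ground literals listed on the slide: in the mosquito clause x is the mosquito M and y ranges over John and Mary, and in the transfusion clause x and y are distinct people. The helper names are illustrative only.

```python
# Weights from the sample clauses on the slide.
WEIGHTS = {"mosquito": 20.0, "transfuse": 5.0}

# The sample truth assignment from the slide.
world = {
    "Mosquito(M)": True,
    "Infected(M,Malaria)": True,
    "Bite(M,John)": False,
    "Bite(M,Mary)": False,
    "Infected(John,Malaria)": True,
    "Infected(Mary,Malaria)": True,
    "Transfuse(Blood,John,Mary)": True,
    "Transfuse(Blood,Mary,John)": False,
}


def implies(body, head):
    """Truth value of (body_1 and ... and body_k) -> head."""
    return (not all(body)) or head


def true_groundings(w):
    """Count satisfied groundings of each clause (typed groundings are an assumption)."""
    people = ["John", "Mary"]
    n_mosquito = sum(
        implies([w["Mosquito(M)"], w["Infected(M,Malaria)"], w[f"Bite(M,{y})"]],
                w[f"Infected({y},Malaria)"])
        for y in people
    )
    n_transfuse = sum(
        implies([w[f"Infected({x},Malaria)"], w[f"Transfuse(Blood,{x},{y})"]],
                w[f"Infected({y},Malaria)"])
        for x in people for y in people if x != y
    )
    return {"mosquito": n_mosquito, "transfuse": n_transfuse}


counts = true_groundings(world)
# Exponent of the unnormalized probability: sum_i w_i * n_i(x).
log_weight = sum(WEIGHTS[name] * counts[name] for name in WEIGHTS)
print(counts, log_weight)
```

Dividing exp(log_weight) by Z would require summing the same quantity over all 2^8 truth assignments, which is omitted here.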
18
MLNs Inference and Learning
• Using probabilistic inference techniques one can determine the most probable truth assignment, probability that a clause holds etc.
• Given a database of training examples, appropriate weights of the formulae can be learned to maximize the probability of the training data
• An open-source software package for MLNs called Alchemy is available
19
Abduction using MLNs
• Given:
Infected(Mary,Malaria) ∧ Transfuse(Blood,Mary,John) → Infected(John,Malaria)
Transfuse(Blood, Mary, John)
Infected(John, Malaria)
• The clause is satisfied whether Infected(Mary, Malaria) is true or false
• Given the observations, a world has the same probability in the MLN whether the explanation is true or false, so explanations cannot be inferred
• The MLN inference mechanism is inherently deductive and not abductive
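A two-world calculation makes the point concrete. The sketch below fixes the two observations as true, keeps only the forward transfusion clause (weight 5, taken from the sample clauses), and compares the worlds in which Infected(Mary,Malaria) is true and false. The function names are illustrative.

```python
import math

W_FORWARD = 5.0  # weight of the forward transfusion clause in the sample MLN


def log_weight(infected_mary, transfuse=True, infected_john=True):
    """Exponent of the unnormalized probability with only the forward clause."""
    # Infected(Mary) and Transfuse(Blood,Mary,John) -> Infected(John)
    satisfied = (not (infected_mary and transfuse)) or infected_john
    return W_FORWARD * satisfied


def posterior_infected_mary():
    """P(Infected(Mary,Malaria) | observations), normalizing over the two worlds."""
    w_true = math.exp(log_weight(True))
    w_false = math.exp(log_weight(False))
    return w_true / (w_true + w_false)


print(posterior_infected_mary())
```

Both worlds satisfy the clause, so the posterior stays at 0.5: the observations give no evidence for the explanation.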
20
Adapting MLNs for Abduction
• Explicitly include the reverse implications
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))
∀y (Infected(y,Malaria) → ∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria)))
• Existentially quantify the universally quantified variables which appear on the LHS but not on the RHS in the original clause
• Now, given Transfuse(Blood, Mary, John) and Infected(John, Malaria), the probability of the world in which Infected(Mary,Malaria) is true will be higher
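The effect of the reverse implication can be seen in the same two-world setting. In this sketch Mary is assumed to be the only candidate donor, so the existential reduces to a single conjunction, and the reverse-clause weight `W_REVERSE` is a hypothetical value chosen for illustration (the talk gives no number for it).

```python
import math

W_FORWARD = 5.0  # weight of the forward clause in the sample MLN
W_REVERSE = 3.0  # hypothetical weight for the reverse implication


def log_weight(infected_mary, transfuse=True, infected_john=True):
    """Exponent of the unnormalized probability with forward and reverse clauses."""
    # Forward: Infected(Mary) and Transfuse(Blood,Mary,John) -> Infected(John)
    forward = (not (infected_mary and transfuse)) or infected_john
    # Reverse: Infected(John) -> exists x (Transfuse(Blood,x,John) and Infected(x)),
    # with x restricted to Mary in this sketch.
    reverse = (not infected_john) or (transfuse and infected_mary)
    return W_FORWARD * forward + W_REVERSE * reverse


def posterior_infected_mary():
    """P(Infected(Mary,Malaria) | observations), normalizing over the two worlds."""
    w_true = math.exp(log_weight(True))
    w_false = math.exp(log_weight(False))
    return w_true / (w_true + w_false)


print(posterior_infected_mary())
```

The world in which Mary is infected now satisfies one more clause, so its posterior is 1 / (1 + exp(-W_REVERSE)), which is above 0.5 for any positive reverse weight.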
21
Adapting MLNs for Abduction
• However, there could be multiple explanations for the same observations:
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))
∀y (Infected(y,Malaria) → ∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria)))
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))
∀y (Infected(y,Malaria) → ∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y)))
• An observation should be explained by one explanation and not multiple explanations
• The system should support “explaining away” [Pearl 1988]
22
Adapting MLNs for Abduction
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))
• Add the disjunction clause and the mutual exclusivity clause for the same RHS term
∀y (Infected(y,Malaria) → (∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria))) ∨ (∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y))))
∀y (Infected(y,Malaria) → ¬(∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria))) ∨ ¬(∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y))))
• Since MLN clauses are soft constraints, both explanations can still be true
23
Adapting MLNs for Abduction
• In general, for the Horn clauses P1 → Q, P2 → Q, …, Pn → Q in the background knowledge base, add:
– A reverse implication disjunction clause
Q → P1 ∨ P2 ∨ … ∨ Pn
– A mutual exclusivity clause for every pair of explanations
Q → ¬P1 ∨ ¬P2
Q → ¬P1 ∨ ¬Pn
…
Q → ¬P2 ∨ ¬Pn
• Weights can be learned from training examples or can be set heuristically
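The general recipe can be sketched for the propositional case. The function below is a simplified illustration and not the formal algorithm from the paper: it treats bodies and heads as opaque strings, so it skips the unification, variable renaming, and existential quantification needed for first-order clauses. The function name and the ASCII connectives (`=>`, `v`, `!`) are choices made for this sketch.

```python
from collections import defaultdict
from itertools import combinations


def abductive_clauses(horn_clauses):
    """Build reverse-implication and mutual-exclusivity clauses.

    horn_clauses is a list of (body, head) pairs, each read as body => head.
    Returns the extra clauses as strings, grouped by head in input order.
    """
    bodies_by_head = defaultdict(list)
    for body, head in horn_clauses:
        bodies_by_head[head].append(body)

    added = []
    for head, bodies in bodies_by_head.items():
        # Reverse implication: Q => P1 v P2 v ... v Pn
        added.append(f"{head} => " + " v ".join(bodies))
        # Mutual exclusivity for every pair: Q => !Pi v !Pj
        for p_i, p_j in combinations(bodies, 2):
            added.append(f"{head} => !{p_i} v !{p_j}")
    return added


clauses = abductive_clauses([("P1", "Q"), ("P2", "Q"), ("P3", "Q")])
for clause in clauses:
    print(clause)
```

For n explanations of the same head this adds one disjunction clause and n(n-1)/2 mutual exclusivity clauses; a head with a single explanation gets only its reverse implication.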
24
Adapting MLNs for Abduction
• There could be constants or variables on the RHS predicate
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))
∀x (Infected(x,Malaria) ∧ Transfuse(Blood,x,John) → Infected(John,Malaria))
• Clauses added for Infected(John,Malaria):
Infected(John,Malaria) → (∃x (Transfuse(Blood,x,John) ∧ Infected(x,Malaria))) ∨ (∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,John)))
Infected(John,Malaria) → ¬(∃x (Transfuse(Blood,x,John) ∧ Infected(x,Malaria))) ∨ ¬(∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,John)))
• Clause added for Infected(y,Malaria):
∀y (Infected(y,Malaria) → ∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y)))
• The formal algorithm is described in the paper; it requires appropriate unifications and variable renamings
28
Experiments: Dataset
• Plan recognition dataset used to evaluate abductive systems [Ng & Mooney 1991; Charniak & Goldman 1991]
• Characters’ higher-level plans must be inferred to explain their observed actions in a narrative text
– “Fred went to the supermarket. He pointed a gun at the owner. He packed his bag.” => robbing
– “Jack went to the supermarket. He found some milk on the shelf. He paid for it.” => shopping
• Dataset contains 25 development [Goldman 1990] and 25 test examples [Ng & Mooney 1992]
29
Experiments: Dataset contd.
• Background knowledge base was constructed for the ACCEL system [Ng and Mooney 1991] to work with the 25 development examples; it contains 107 rules such as:
instance_shopping(s) ^ go_step(s,g) → instance_going(g)
instance_shopping(s) ^ go_step(s,g) ^ shopper(s,p) → goer(g,p)
• Narrative text is represented in first-order logic; average 12.6 literals
– “Bill went to the store. He paid for some milk.”
instance_going(Go1) goer(Go1,Bill) destination_go(Store1) instance_paying(Pay1) payer(Pay1,Bill) thing_paid(Pay1,Milk1)
• Assumptions explaining the above actions:
instance_shopping(S1) shopper(S1,Bill) go_step(S1,Go1) pay_step(S1,Pay1) thing_shopped_for(S1,Milk)
30
Experiments: Methodology
• Our algorithm automatically adds clauses to the knowledge-base for performing abduction using MLNs
• We found that 25 development examples were too few to learn weights for MLNs, so we set the weights heuristically
– Small negative weights on unit clauses so that they are not assumed for no reason
– Medium weights on reverse implication clauses
– Large weights on mutual exclusivity clauses
• Given a set of observations, we use Alchemy’s probabilistic inference to determine the most likely truth assignment for the remaining literals
31
Experiments: Methodology contd.
• We compare with the ACCEL system [Ng & Mooney 1992], a purely logic-based system for abduction
• Selects the best explanation using a metric
– Simplicity metric: selects the explanation of smallest size
– Coherence metric: selects the explanation that maximally connects the observations (specifically geared towards this task)
• “John took the bus. He bought milk.” => John took the bus to the store where he bought the milk.
32
Experiments: Methodology contd.
• Besides finding the assumptions, a deductive system like MLN also finds other facts that can be deduced from the assumptions
• We deductively expand ACCEL’s output and gold-standard answers for a fair comparison
• We measure
– Precision: what fraction of the predicted ground literals are in the gold-standard answers
– Recall: what fraction of the ground literals in the gold-standard answers were predicted
– F-measure: harmonic mean of precision and recall
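The three measures can be computed directly over sets of ground literals. In the sketch below the gold set reuses the shopping assumptions from the dataset slide, while the predicted set is a made-up system output used only to exercise the function.

```python
def precision_recall_f(predicted, gold):
    """Precision, recall, and F-measure over two collections of ground literals."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    # Harmonic mean; defined as 0 when both precision and recall are 0.
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


gold = {
    "instance_shopping(S1)", "shopper(S1,Bill)", "go_step(S1,Go1)",
    "pay_step(S1,Pay1)", "thing_shopped_for(S1,Milk)",
}
# Hypothetical prediction: three correct literals and one spurious one.
predicted = {
    "instance_shopping(S1)", "shopper(S1,Bill)", "go_step(S1,Go1)",
    "instance_robbing(R1)",
}

p, r, f = precision_recall_f(predicted, gold)
print(p, r, f)
```

Here three of the four predicted literals are correct (precision 0.75) and three of the five gold literals are recovered (recall 0.6).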
33
Experiments: Results
[Figure: bar chart of F-measure, recall, and precision (axis from 70 to 100) for MLN, ACCEL-Simplicity, and ACCEL-Coherence on the development set]
34
Experiments: Results contd.
[Figure: bar chart of F-measure, recall, and precision (axis from 70 to 100) for MLN, ACCEL-Simplicity, and ACCEL-Coherence on the test set]
35
Experiments: Results contd.
• MLN performs better than ACCEL-simplicity particularly on the development set
• ACCEL-coherence performs the best, but was specifically tailored for the narrative understanding task
• The dataset used does not require full probabilistic treatment because there is little uncertainty in the knowledge base or observations
• MLNs did not need any heuristic metric but simply found the most probable explanation
36
Future Work
• Evaluate probabilistic abduction using MLNs on a task in which uncertainty plays a bigger role
• Evaluate on a larger dataset on which the weights could be learned to automatically adapt to a particular domain
– Previous abductive systems like ACCEL have no learning mechanism
• Perform probabilistic abduction using other frameworks for combining first-order logic and graphical models [Getoor & Taskar 2007], for example, Bayesian Logic Programming [Kersting & De Raedt 2001], and compare with the presented approach
37
Conclusions
• A general method for probabilistic first-order logical abduction using MLNs
• Existing off-the-shelf deductive inference system of MLNs is employed to do abduction by suitably reversing the implications
• Handles uncertainties using probabilities and an unbounded number of related entities using first-order logic, and is capable of learning
• Experiments on a small plan recognition dataset demonstrated that it compares favorably with special-purpose logic-based abductive systems
38
Thanks!
Questions?