Post on 24-Feb-2016
Digital Camera and Computer Vision Laboratory
Department of Computer Science and Information Engineering
National Taiwan University, Taipei, Taiwan, R.O.C.
Computer Vision
Chapter 4
Statistical Pattern Recognition
DC & CV Lab.CSIE NTU
Introduction
Units: image regions and projected segments
Each unit has an associated measurement vector
A decision rule is used to assign each unit to a class or category optimally
Introduction (Cont.)
Feature selection and extraction techniques
Decision rule construction techniques
Techniques for estimating decision rule error
Simple Pattern Discrimination
Also called the pattern identification process
A unit is observed or measured
A category assignment is made that names or classifies the unit as a type of object
The category assignment is made only on the basis of the observed measurement (pattern)
Simple Pattern Discrimination (cont.)
a: assigned category from a set of categories C
t: true category identification from C
d: observed measurement from a set of measurements D
(t, a, d): event of classifying the observed unit
P(t, a, d): probability of the event (t, a, d)
Economic Gain Matrix
e(t, a): economic gain/utility with true category t and assigned category a
A mechanism to evaluate a decision rule
Identity gain matrix: e(t, a) = 1 if t = a, and 0 otherwise
An Instance
Another Instance
P(g, g): probability of true good, assigned good
P(g, b): probability of true good, assigned bad
...
e(g, g): economic consequence for event (g, g)
...
e positive: profit consequence
e negative: loss consequence
Another Instance (cont.)
Fraction of good objects manufactured: P(g) = P(g, g) + P(g, b)
Fraction of bad objects manufactured: P(b) = P(b, g) + P(b, b)
Expected profit per object: E = P(g, g)e(g, g) + P(g, b)e(g, b) + P(b, g)e(b, g) + P(b, b)e(b, b)
Conditional Probability
P(b|g) = P(g, b) / P(g): false-alarm rate (true good, assigned bad)
P(g|b) = P(b, g) / P(b): misdetection rate (true bad, assigned good)
Conditional Probability (cont.)
Another formula for expected profit per object:
E = P(g, g)e(g, g) + P(g, b)e(g, b) + P(b, g)e(b, g) + P(b, b)e(b, b)
= P(g|g)P(g)e(g,g) + P(b|g)P(g)e(g,b) + P(g|b)P(b)e(b,g) + P(b|b)P(b)e(b,b)
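The two ways of writing the expected profit per object can be checked numerically. The sketch below assumes a made-up joint probability table P(t, a) and a made-up gain table e(t, a); none of these numbers come from the slides, and the function names are hypothetical.

```python
# Hypothetical joint probabilities P(t, a); keys are (true, assigned).
# These numbers are illustrative and do not come from the slides.
P = {("g", "g"): 0.90, ("g", "b"): 0.05, ("b", "g"): 0.01, ("b", "b"): 0.04}

# Hypothetical economic gains e(t, a): profit if positive, loss if negative.
e = {("g", "g"): 10.0, ("g", "b"): -5.0, ("b", "g"): -100.0, ("b", "b"): 0.0}

CATEGORIES = ("g", "b")


def expected_profit_joint(P, e):
    """E = sum over (t, a) of P(t, a) * e(t, a)."""
    return sum(P[(t, a)] * e[(t, a)] for t in CATEGORIES for a in CATEGORIES)


def expected_profit_conditional(P, e):
    """E = sum over (t, a) of P(a|t) * P(t) * e(t, a)."""
    total = 0.0
    for t in CATEGORIES:
        p_t = sum(P[(t, a)] for a in CATEGORIES)  # P(t) = P(t, g) + P(t, b)
        for a in CATEGORIES:
            p_a_given_t = P[(t, a)] / p_t         # P(a|t) = P(t, a) / P(t)
            total += p_a_given_t * p_t * e[(t, a)]
    return total


profit_joint = expected_profit_joint(P, e)
profit_conditional = expected_profit_conditional(P, e)
print(profit_joint, profit_conditional)
```

With these assumed numbers both forms give an expected profit of 7.75 per object, up to floating-point rounding.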
Example 4.1
P(g) = 0.95, P(b) = 0.05
Example 4.2
P(g) = 0.95, P(b) = 0.05
Decision Rule Construction
P(t, a): obtained by summing P(t, a, d) over every measurement d
Therefore, P(t, a) = Σ_d P(t, a, d)
Average economic gain: E[e] = Σ_t Σ_a e(t, a) P(t, a)
Decision Rule Construction (cont.)
We can use the identity matrix as the economic gain matrix to compute the probability of correct assignment: P_c = Σ_t Σ_a e(t, a) P(t, a) = Σ_t P(t, t), where e(t, a) = 1 if t = a and 0 otherwise
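A minimal sketch of this computation, assuming a small made-up table of joint probabilities P(t, a, d): it sums out d to get P(t, a), then evaluates the average gain with the identity gain matrix, which is the probability of correct assignment. The table values and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical joint probabilities P(t, a, d); keys are (true, assigned, measurement).
P_tad = {
    ("c1", "c1", "d1"): 0.25, ("c1", "c1", "d2"): 0.05, ("c1", "c2", "d3"): 0.10,
    ("c2", "c1", "d1"): 0.05, ("c2", "c1", "d2"): 0.10, ("c2", "c2", "d3"): 0.45,
}


def marginal_ta(P_tad):
    """P(t, a) = sum over d of P(t, a, d)."""
    P_ta = defaultdict(float)
    for (t, a, _d), p in P_tad.items():
        P_ta[(t, a)] += p
    return dict(P_ta)


def average_gain(P_ta, gain):
    """E[e] = sum over (t, a) of e(t, a) * P(t, a)."""
    return sum(gain(t, a) * p for (t, a), p in P_ta.items())


def identity_gain(t, a):
    """Identity gain matrix: 1 for a correct assignment, 0 otherwise."""
    return 1.0 if t == a else 0.0


P_ta = marginal_ta(P_tad)
prob_correct = average_gain(P_ta, identity_gain)
print(prob_correct)
```

For this assumed table the probability of correct assignment is P(c1, c1) + P(c2, c2) = 0.30 + 0.45 = 0.75.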
Fair Game Assumption
The decision rule uses only the measurement data in making an assignment; nature and the decision rule are not in collusion
In other words, P(a| t, d) = P(a| d)
Fair Game Assumption (cont.)
From the definition of conditional probability, P(a| t, d) = P(t, a, d) / P(t, d), so
P(t, a, d) = P(a| t, d)*P(t, d) //By conditional probability
= P(a| d)*P(t, d) //By fair game assumption
By definition, P(t, a| d) = P(t, a, d) / P(d)
= P(a| d)*P(t, d) / P(d) = P(a| d)*P(t| d)
Deterministic Decision Rule
We use the notation f(a|d) to completely define a decision rule; f(a|d) represents all the conditional probabilities associated with the decision rule
A deterministic decision rule: f(a|d) is either 0 or 1 for every a and d, so each measurement d is always assigned to the same category
Decision rules that are not deterministic are called probabilistic, nondeterministic, or stochastic
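Expected Value on f(a|d)
Previous formula: E[e] = Σ_t Σ_a e(t, a) P(t, a) = Σ_t Σ_a e(t, a) Σ_d P(t, a, d)
By P(t, a, d) = P(a| t, d)*P(t, d) //By conditional probability
and P(a| t, d) = P(a| d) = f(a|d) //By fair game assumption
=> E[e; f] = Σ_t Σ_a e(t, a) Σ_d f(a|d) P(t, d)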
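Bayes Decision Rules
Maximize expected economic gain
Satisfy E[e; f] ≥ E[e; g] for any decision rule g
Constructing f: for each measurement d, set f(a|d) = 1 for one category a that maximizes Σ_t e(t, a) P(t, d), and f(a|d) = 0 for every other category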
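A deterministic Bayes rule can be built by choosing, for each measurement d, a category a that maximizes Σ_t e(t, a) P(t, d). The sketch below assumes an illustrative joint table P(t, d) and the identity gain matrix; the numbers and function names are assumptions for illustration, not values recovered from the slides.

```python
CATEGORIES = ("c1", "c2")

# Illustrative joint probabilities P(t, d); keys are (true category, measurement).
P_td = {
    ("c1", "d1"): 0.12, ("c1", "d2"): 0.18, ("c1", "d3"): 0.30,
    ("c2", "d1"): 0.20, ("c2", "d2"): 0.16, ("c2", "d3"): 0.04,
}

# Identity gain matrix e(t, a).
identity = {(t, a): 1.0 if t == a else 0.0 for t in CATEGORIES for a in CATEGORIES}


def bayes_rule(P_td, e, categories):
    """Return {d: a}, where a maximizes sum_t e(t, a) * P(t, d) for each d."""
    rule = {}
    for d in sorted({d for (_t, d) in P_td}):
        scores = {
            a: sum(e[(t, a)] * P_td.get((t, d), 0.0) for t in categories)
            for a in categories
        }
        rule[d] = max(scores, key=scores.get)
    return rule


def expected_gain(rule, P_td, e):
    """E[e; f] for a deterministic rule f given as {d: a}."""
    return sum(e[(t, rule[d])] * p for (t, d), p in P_td.items())


rule = bayes_rule(P_td, identity, CATEGORIES)
gain = expected_gain(rule, P_td, identity)
print(rule, gain)
```

With this table the rule assigns d1 to c2 and both d2 and d3 to c1, for an expected gain of 0.20 + 0.18 + 0.30 = 0.68. Ties, which do not occur here, would be broken in favor of the first category listed.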
Continuous Measurement
For the same example, try continuous density functions of the measurements, one for each true category
Measurements lie in the closed interval [0, 1]
Prove that they are indeed density functions
Continuous Measurement (cont.)
Suppose that the prior probability of is and the prior probability of is
= When , a Bayes decision rule
will assign an observed unit to t1, which implies
=>
Continuous Measurement (cont.)
Since 0.805 > 0.68, the continuous measurement gives a larger expected economic gain than the discrete one
Prior Probability
The Bayes rule: assign any category a that maximizes Σ_t e(t, a) P(t, d)
Replace P(t, d) with P(d| t) P(t)
The Bayes rule can be determined by assigning any category a that maximizes Σ_t e(t, a) P(d| t) P(t)
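Writing P(t, d) as P(d| t) P(t) makes the dependence of the Bayes rule on the prior explicit. The sketch below assumes illustrative class-conditional probabilities P(d| t) and the identity gain matrix, then shows the rule changing as the prior changes; all numbers and names are assumptions.

```python
# Illustrative class-conditional probabilities P(d|t).
P_d_given_t = {
    "c1": {"d1": 0.2, "d2": 0.3, "d3": 0.5},
    "c2": {"d1": 0.5, "d2": 0.4, "d3": 0.1},
}


def identity_gain(t, a):
    return 1.0 if t == a else 0.0


def bayes_rule_from_prior(P_d_given_t, prior, gain):
    """Return {d: a}, where a maximizes sum_t e(t, a) * P(d|t) * P(t)."""
    categories = list(prior)
    measurements = list(next(iter(P_d_given_t.values())))
    rule = {}
    for d in measurements:
        scores = {
            a: sum(gain(t, a) * P_d_given_t[t][d] * prior[t] for t in categories)
            for a in categories
        }
        rule[d] = max(scores, key=scores.get)
    return rule


rule_equal = bayes_rule_from_prior(P_d_given_t, {"c1": 0.5, "c2": 0.5}, identity_gain)
rule_skewed = bayes_rule_from_prior(P_d_given_t, {"c1": 0.8, "c2": 0.2}, identity_gain)
print(rule_equal)
print(rule_skewed)
```

With equal priors only d3 is assigned to c1; once P(c1) rises to 0.8, every measurement is assigned to c1.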
Economic Gain Matrix
Identity matrix
Incorrect loses 1
A more balanced instance
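Economic Gain Matrix (cont.)
Suppose e1 and e2 are two different economic gain matrices with the relationship e2(t, a) = α e1(t, a) + β, where α > 0
According to the construction rule, given a measurement d, a Bayes rule under e1 assigns a category a that maximizes Σ_t e1(t, a) P(t, d)
Because Σ_t e2(t, a) P(t, d) = α Σ_t e1(t, a) P(t, d) + β P(d) and α > 0, the same category maximizes both sums
We then get that a Bayes decision rule under e1 is also a Bayes decision rule under e2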
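As a quick check, the three gain matrices named above (identity, incorrect loses 1, a more balanced instance) can be compared on one joint table. The sketch assumes that "incorrect loses 1" means 0 on the diagonal and -1 elsewhere, and that the balanced instance means +1 on the diagonal and -1 elsewhere; under that reading each matrix is a positive scaling plus a constant offset of the identity matrix. The joint table is illustrative.

```python
CATEGORIES = ("c1", "c2")

# Illustrative joint probabilities P(t, d).
P_td = {
    ("c1", "d1"): 0.12, ("c1", "d2"): 0.18, ("c1", "d3"): 0.30,
    ("c2", "d1"): 0.20, ("c2", "d2"): 0.16, ("c2", "d3"): 0.04,
}


def make_gain(correct, incorrect):
    """Gain matrix with one value on the diagonal and another off it."""
    return {(t, a): correct if t == a else incorrect
            for t in CATEGORIES for a in CATEGORIES}


identity = make_gain(1.0, 0.0)          # e1
incorrect_loses = make_gain(0.0, -1.0)  # assumed reading: e1 - 1
balanced = make_gain(1.0, -1.0)         # assumed reading: 2 * e1 - 1


def bayes_rule(P_td, e):
    """Return {d: a}, where a maximizes sum_t e(t, a) * P(t, d)."""
    rule = {}
    for d in sorted({d for (_t, d) in P_td}):
        scores = {a: sum(e[(t, a)] * P_td[(t, d)] for t in CATEGORIES)
                  for a in CATEGORIES}
        rule[d] = max(scores, key=scores.get)
    return rule


rules = [bayes_rule(P_td, e) for e in (identity, incorrect_loses, balanced)]
print(rules)
```

All three matrices produce the same assignments here; only the numeric value of the expected gain differs.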
Maximin Decision Rule
Maximizes average gain over worst prior probability
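For two categories, the worst prior can be found by evaluating the Bayes gain as a function of p = P(c1) and taking its minimum. The sketch below assumes the identity gain matrix and illustrative class-conditional probabilities, and uses a plain grid search, so the result is only accurate to the grid step. It finds the least favorable prior and the gain at that prior; it does not construct a randomized rule.

```python
# Illustrative class-conditional probabilities P(d|c1) and P(d|c2) over d1, d2, d3.
COND_C1 = (0.2, 0.3, 0.5)
COND_C2 = (0.5, 0.4, 0.1)


def bayes_gain(p, cond_c1, cond_c2):
    """Bayes gain under the identity gain matrix when P(c1) = p.

    For each measurement the Bayes rule keeps the larger of the two
    joint probabilities p * P(d|c1) and (1 - p) * P(d|c2).
    """
    return sum(max(p * q1, (1 - p) * q2) for q1, q2 in zip(cond_c1, cond_c2))


def worst_prior(cond_c1, cond_c2, steps=1000):
    """Grid search for the prior P(c1) that minimizes the Bayes gain."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda p: bayes_gain(p, cond_c1, cond_c2))


p_star = worst_prior(COND_C1, COND_C2)
lowest_gain = bayes_gain(p_star, COND_C1, COND_C2)
print(p_star, lowest_gain)
```

For these assumed probabilities the Bayes gain is piecewise linear in p, and its minimum lies near p = 4/7 with a gain of roughly 0.671.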
Example 4.3
Example 4.3 (cont.)
The lowest Bayes gain is achieved when
The lowest gain is 0.6714
Example 4.4
Example 4.5
Example 4.5 (cont.)
f1 and f4 form the lowest Bayes gain
Find the p that eliminates P(c1): p = 0.3103
Decision Rule Error
The misidentification error α_k
The false-identification error β_k
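Both error rates can be read off a joint table P(t, a). The sketch below assumes the following definitions, which are not spelled out in the text above: α_k = P(a ≠ k | t = k), the chance that a unit of true category k is assigned elsewhere, and β_k = P(a = k | t ≠ k), the chance that a unit of another category is assigned to k. The table values are hypothetical.

```python
CATEGORIES = ("g", "b")

# Hypothetical joint probabilities P(t, a); keys are (true, assigned).
P_ta = {("g", "g"): 0.90, ("g", "b"): 0.05, ("b", "g"): 0.01, ("b", "b"): 0.04}


def error_rates(P_ta, categories):
    """Return (alpha, beta) dictionaries keyed by category k.

    Assumed definitions:
      alpha[k] = P(assigned != k | true == k)
      beta[k]  = P(assigned == k | true != k)
    """
    alpha, beta = {}, {}
    for k in categories:
        p_true_k = sum(P_ta[(k, a)] for a in categories)
        p_true_other = sum(P_ta[(t, a)] for t in categories if t != k
                           for a in categories)
        missed = sum(P_ta[(k, a)] for a in categories if a != k)
        falsely_named = sum(P_ta[(t, k)] for t in categories if t != k)
        alpha[k] = missed / p_true_k
        beta[k] = falsely_named / p_true_other
    return alpha, beta


alpha, beta = error_rates(P_ta, CATEGORIES)
print(alpha, beta)
```

With two categories the misidentification error of one category equals the false-identification error of the other, which the output reflects.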
An Instance
Reserving Judgment
The decision rule may withhold judgment for some measurements
Then the decision rule is characterized by the fraction of time it withholds judgment and the error rate for those measurements it does assign.
It is an important technique to control error rate.
Reserving Judgment (cont.)
Let α_k be the maximum Type I error we can tolerate with category k
Let β_k be the maximum Type II error we can tolerate with category k
Measurements that will not be rejected (acceptance region)
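One simple way to reserve judgment is to assign a category only when its posterior probability clears a threshold. This is a swapped-in illustration (a posterior-threshold reject rule), not the acceptance-region construction from the slides, whose formula is not available here. The joint table and the threshold 0.6 are assumptions.

```python
CATEGORIES = ("c1", "c2")

# Illustrative joint probabilities P(t, d).
P_td = {
    ("c1", "d1"): 0.12, ("c1", "d2"): 0.18, ("c1", "d3"): 0.30,
    ("c2", "d1"): 0.20, ("c2", "d2"): 0.16, ("c2", "d3"): 0.04,
}


def reserve_judgment_rule(P_td, categories, min_posterior):
    """Map each d to its most probable category, or None to withhold judgment."""
    rule = {}
    for d in sorted({d for (_t, d) in P_td}):
        p_d = sum(P_td[(t, d)] for t in categories)
        posteriors = {t: P_td[(t, d)] / p_d for t in categories}
        best = max(posteriors, key=posteriors.get)
        rule[d] = best if posteriors[best] >= min_posterior else None
    return rule


def summarize(rule, P_td):
    """Return (fraction reserved, error rate among assigned units)."""
    reserved = sum(p for (_t, d), p in P_td.items() if rule[d] is None)
    wrong = sum(p for (t, d), p in P_td.items()
                if rule[d] is not None and rule[d] != t)
    assigned = 1.0 - reserved
    return reserved, (wrong / assigned if assigned > 0 else 0.0)


rule = reserve_judgment_rule(P_td, CATEGORIES, min_posterior=0.6)
reserved, error_rate = summarize(rule, P_td)
print(rule, reserved, error_rate)
```

On this table the rule withholds judgment on d2, which is 34% of the units, and the error rate on the remaining units falls from 0.32 (no rejection) to about 0.24.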
Nearest Neighbor Rule
Assign pattern x to the category of the closest vector in the training set
The definition of “closest”: the training vector at the smallest distance from x, where the distance is a metric on the measurement space
Chief difficulty: the brute-force nearest neighbor algorithm has computational complexity proportional to the number of patterns in the training set
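A brute-force version of the rule is a single pass over the training set, which is where the linear cost per query comes from. The sketch assumes Euclidean distance as the metric and a tiny made-up training set.

```python
import math

# Hypothetical training set: (measurement vector, category).
training_set = [
    ((1.0, 1.0), "c1"),
    ((1.5, 2.0), "c1"),
    ((5.0, 5.0), "c2"),
    ((6.0, 4.5), "c2"),
]


def nearest_neighbor(x, training_set):
    """Return the category of the training vector closest to x.

    Scans every training pattern once, so the cost of one query is
    proportional to the size of the training set.
    """
    _vector, category = min(training_set, key=lambda pair: math.dist(x, pair[0]))
    return category


print(nearest_neighbor((2.0, 2.0), training_set))
print(nearest_neighbor((5.5, 5.0), training_set))
```

Any other metric can be substituted for `math.dist` without changing the structure of the search.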
Binary Decision Tree Classifier
Assign by hierarchical decision procedure
Major Problems
Choosing tree structure
Choosing features used at each non-terminal node
Choosing decision rule at each non-terminal node
Decision Rules at the Non-terminal Node
Thresholding the measurement component
Fisher’s linear decision rule
Bayes quadratic decision rule
Bayes linear decision rule
Linear decision rule from the first principal component
Thresholding the measurement component
Measurement component
Threshold
Find maximum purity
Repeat for all possible measurement components
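The search over components and thresholds can be sketched as a double loop. The purity measure is not defined in the text above, so this sketch assumes a simple one: the fraction of samples that belong to the majority class on their side of the split. The sample data and function names are hypothetical.

```python
from collections import Counter

# Hypothetical two-component measurements with their categories.
samples = [(1.0, 7.0), (2.0, 3.0), (3.0, 8.0), (6.0, 2.0), (7.0, 9.0), (8.0, 4.0)]
labels = ["c1", "c1", "c1", "c2", "c2", "c2"]


def split_purity(left_labels, right_labels):
    """Assumed purity: share of samples in the majority class of their side."""
    total = len(left_labels) + len(right_labels)
    majority = sum(max(Counter(side).values())
                   for side in (left_labels, right_labels) if side)
    return majority / total


def best_threshold(samples, labels):
    """Return (purity, component index, threshold) of the purest split."""
    best = None
    for i in range(len(samples[0])):                 # every measurement component
        values = sorted({x[i] for x in samples})
        for low, high in zip(values, values[1:]):    # midpoints between values
            threshold = (low + high) / 2
            left = [lab for x, lab in zip(samples, labels) if x[i] <= threshold]
            right = [lab for x, lab in zip(samples, labels) if x[i] > threshold]
            purity = split_purity(left, right)
            if best is None or purity > best[0]:
                best = (purity, i, threshold)
    return best


print(best_threshold(samples, labels))
```

On this data the first component separates the two categories completely at a threshold of 4.5, while no threshold on the second component does.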
Fisher’s linear decision rule
Discriminant function
Satisfy the maximum Fisher discriminant ratio
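For two classes, the direction that maximizes the Fisher discriminant ratio is w = S_w^(-1) (m1 - m2), where m1 and m2 are the class means and S_w is the pooled within-class scatter matrix. The sketch below works in two dimensions with made-up data and compares the ratio at w against a few other directions; the helper names are hypothetical.

```python
# Hypothetical two-dimensional training data for two classes.
class1 = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class2 = [(5.0, 5.0), (6.0, 7.0), (7.0, 6.0), (6.0, 5.0)]


def mean(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(2)]


def scatter(vectors, m):
    """Scatter matrix: sum of outer products of deviations from the mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for v in vectors:
        dev = [v[0] - m[0], v[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += dev[i] * dev[j]
    return s


def fisher_direction(c1, c2):
    """Return (w, S_w, m1 - m2) with w = S_w^-1 (m1 - m2)."""
    m1, m2 = mean(c1), mean(c2)
    s1, s2 = scatter(c1, m1), scatter(c2, m2)
    sw = [[s1[i][j] + s2[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    return w, sw, diff


def fisher_ratio(w, sw, diff):
    """(projected mean difference)^2 / projected within-class scatter."""
    between = (w[0] * diff[0] + w[1] * diff[1]) ** 2
    within = sum(w[i] * sw[i][j] * w[j] for i in range(2) for j in range(2))
    return between / within


w, sw, diff = fisher_direction(class1, class2)
best_ratio = fisher_ratio(w, sw, diff)
print(w, best_ratio)
```

A unit is then classified by projecting it onto w and comparing the projection with a threshold, which this sketch leaves out.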
Bayes quadratic decision rule & Bayes linear decision rule
Bayes quadratic decision rule
Bayes linear decision rule
Linear decision rule from the first principal component
Principal component analysis
First principal component
[Figure: scatterplot (EX1-1.STA 2v*15c) with axes X1 and X2]
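The first principal component is the direction of largest variance, which is the eigenvector of the covariance matrix with the largest eigenvalue. For two measurement components this has a closed form, sketched below with made-up points; the data are not taken from the scatterplot, and the function name is hypothetical.

```python
import math

# Hypothetical (X1, X2) measurements.
points = [(152.0, 48.0), (160.0, 55.0), (168.0, 62.0), (176.0, 70.0), (180.0, 74.0)]


def first_principal_component(points):
    """Return (mean, unit direction, variance along that direction) in 2-D."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[a, b], [b, c]].
    a = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    c = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Largest eigenvalue of a symmetric 2x2 matrix.
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if abs(b) > 1e-12:
        v = (b, lam - a)          # eigenvector for eigenvalue lam
    else:
        v = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    return (mx, my), (v[0] / norm, v[1] / norm), lam


center, direction, variance = first_principal_component(points)
# Project each point onto the first principal component.
projections = [(p[0] - center[0]) * direction[0] + (p[1] - center[1]) * direction[1]
               for p in points]
print(direction, variance)
```

A linear decision rule can then threshold the one-dimensional projections; how that threshold is chosen is outside this sketch.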
Error Estimation
An important way to characterize the performance of a decision rule
Training data set: must be independent of the testing data set
Hold-out method: a common technique; construct the decision rule with half the data set, and test with the other half
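The hold-out method can be sketched in a few lines: shuffle the labeled data, build the rule on one half, and count errors on the other half. The data, the nearest-class-mean rule, and the fixed seed below are assumptions chosen to keep the example small.

```python
import random

# Hypothetical one-dimensional measurements with their true categories.
data = [(1.0, "c1"), (1.5, "c1"), (2.0, "c1"), (2.5, "c1"), (3.0, "c1"),
        (7.0, "c2"), (7.5, "c2"), (8.0, "c2"), (8.5, "c2"), (9.0, "c2")]


def hold_out_split(data, seed=0):
    """Shuffle a copy of the data and split it into two disjoint halves."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def fit_nearest_mean(train):
    """Build a rule that assigns x to the category with the closest mean.

    Only categories that appear in the training half are considered.
    """
    sums, counts = {}, {}
    for x, label in train:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}
    return lambda x: min(means, key=lambda label: abs(x - means[label]))


def error_rate(classify, test_set):
    """Fraction of test units whose assigned category differs from the truth."""
    wrong = sum(1 for x, label in test_set if classify(x) != label)
    return wrong / len(test_set)


train, test = hold_out_split(data)
classify = fit_nearest_mean(train)       # rule built from the training half only
holdout_error = error_rate(classify, test)
print(holdout_error)
```

Because the test half plays no part in building the rule, the resulting error estimate is not biased by the rule having already seen those units.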
Neural Network
A set of units each of which takes a linear combination of values from either an input vector or the output of other units
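A single unit and a two-layer arrangement of units can be sketched as follows. The sigmoid applied after the linear combination is an assumption, since the text above only mentions the linear combination, and the weights are arbitrary illustrative numbers.

```python
import math


def unit(inputs, weights, bias):
    """Linear combination of the inputs followed by an assumed sigmoid."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-s))


def forward(x, hidden_layer, output_unit):
    """Hidden units read the input vector; the output unit reads the hidden units."""
    hidden = [unit(x, weights, bias) for weights, bias in hidden_layer]
    out_weights, out_bias = output_unit
    return unit(hidden, out_weights, out_bias)


# Arbitrary illustrative weights: two hidden units, one output unit.
hidden_layer = [([0.5, -0.4], 0.1), ([-0.3, 0.8], 0.0)]
output_unit = ([1.2, -0.7], 0.05)

output = forward([1.0, 2.0], hidden_layer, output_unit)
print(output)
```

Training would adjust the weights and biases, for example by back propagation, which is not shown here.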
Neural Network (cont.)
Has a training algorithm
Responses observed
Reinforcement algorithms
Back propagation to change weights
Summary
Bayesian approach
Maximin decision rule
Misidentification and false-alarm error rates
Nearest neighbor rule
Construction of decision trees
Estimation of decision rule error
Neural network