4. Pattern Recognition - Yonsei University Pattern... · 2014-12-29 · Learning and Adaptation [2]...
E-mail: [email protected] · http://web.yonsei.ac.kr/hgjung
▷ Introduction to Pattern Recognition System
▷ Efficient Feature Extraction: Haar-like Features and the Integral Image
▷ Dimension Reduction: PCA
▷ Bayesian Decision Theory
▷ Bayesian Discriminant Function for Normal Density
▷ Linear Discriminant Analysis
▷ Linear Discriminant Functions
▷ Support Vector Machine
▷ k Nearest Neighbor
▷ Statistical Clustering
Machine Perception [2]
• Build a machine that can recognize patterns:
– Speech recognition
– Fingerprint identification
– OCR (Optical Character Recognition)
– DNA sequence identification
Components of Pattern Classification System [6]
Types of Prediction Problems [6]
Pattern Recognition Approaches [6]
Pattern Recognition Approaches [6]
Machine Perception [2]

Example: “Sorting incoming fish on a conveyor according to species using optical sensing.”
Species: sea bass, salmon
Machine Perception [2]

Problem analysis: set up a camera and take some sample images from which to extract features.
Preprocessing: use a segmentation operation to isolate each fish from the others and from the background.
Feature extraction: information from a single fish is sent to a feature extractor, whose purpose is to reduce the data by measuring certain features.
The features are then passed to a classifier.
Feature Selection [2]

The length of the fish is a possible feature for discrimination.
Length alone is a poor feature!
Feature Selection [2]

The lightness of the fish is a possible feature for discrimination.
Feature Selection [2]

• Adopt the lightness and add the width of the fish.
Fish feature vector: xT = [x1, x2], where x1 is lightness and x2 is width.
Generalization [2]

The central aim of designing a classifier is to correctly classify novel input.
Generalization [3]

Polynomial Curve Fitting
Generalization: Model Selection [3]

Polynomial Curve Fitting
0th Order Polynomial 1st Order Polynomial
3rd Order Polynomial 9th Order Polynomial
Generalization: Model Selection [3]

Polynomial Curve Fitting, Over-fitting
Root‐Mean‐Square (RMS) Error:
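The formula itself did not survive this transcript; in the notation of Bishop [3], with E(w) the sum-of-squares error of the fitted polynomial y(x, w) and w* the fitted coefficients, it is:

```latex
E(\mathbf{w}) = \frac{1}{2}\sum_{n=1}^{N}\{y(x_n,\mathbf{w}) - t_n\}^2,
\qquad
E_{\mathrm{RMS}} = \sqrt{2E(\mathbf{w}^*)/N}
```

Dividing by N lets data sets of different sizes be compared on an equal footing, and the square root puts the error on the same scale as the targets t_n.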
Generalization: Sample Size [3]

Polynomial Curve Fitting
9th Order Polynomial N=15
Generalization: Sample Size [3]

Polynomial Curve Fitting
9th Order Polynomial N=100
Generalization: Regularization [3]

Polynomial Curve Fitting
Regularization: Penalize large coefficient values
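The penalized error function referenced here was lost in this transcript; in Bishop's [3] notation, with λ controlling the strength of the penalty, it takes the form:

```latex
\widetilde{E}(\mathbf{w}) = \frac{1}{2}\sum_{n=1}^{N}\{y(x_n,\mathbf{w}) - t_n\}^2
 + \frac{\lambda}{2}\lVert\mathbf{w}\rVert^2
```

Larger λ shrinks the coefficients toward zero, trading a small increase in training error for better generalization.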
Learning and Adaptation [2]

• Supervised Learning: a teacher provides a category label or cost for each pattern in a training set, and the system seeks to reduce the sum of the costs over these patterns.
• Unsupervised Learning: there is no explicit teacher, and the system forms clusters or “natural groupings” of the input patterns.
• Reinforcement Learning: no desired category signal is given; instead, the only teaching feedback is whether the tentative category is right or wrong.
Linear Discriminant Functions [6]
Efficient Feature Extraction: Haar-like Features and the Integral Image
Haar-like Feature [7]

The simple features used are reminiscent of the Haar basis functions, which were used by Papageorgiou et al. (1998). There are three kinds of features: the two-rectangle feature, the three-rectangle feature, and the four-rectangle feature. Given that the base resolution of the detector is 24x24, the exhaustive set of rectangle features is quite large: 160,000.
Haar-like Feature: Integral Image [7]

Rectangle features can be computed very rapidly using an intermediate representation of the image which we call the integral image. The integral image at location (x, y) contains the sum of the pixels above and to the left of (x, y), inclusive:

ii(x, y) = Σ_{x′ ≤ x, y′ ≤ y} i(x′, y′)

where ii(x, y) is the integral image and i(x, y) is the original image (see Fig. 2). Using the following pair of recurrences:

s(x, y) = s(x, y − 1) + i(x, y)
ii(x, y) = ii(x − 1, y) + s(x, y)

(where s(x, y) is the cumulative row sum, s(x, −1) = 0, and ii(−1, y) = 0), the integral image can be computed in one pass over the original image.
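As a concrete illustration, a minimal NumPy sketch of the one-pass computation; the explicit loops mirror the recurrences, and the (row, column) indexing is an implementation choice, not part of the original formulation:

```python
import numpy as np

def integral_image(img):
    """One-pass integral image: ii[x, y] = sum of img over all pixels
    above and to the left of (x, y), inclusive, via the recurrences
    s(x, y) = s(x, y-1) + i(x, y) and ii(x, y) = ii(x-1, y) + s(x, y)."""
    h, w = img.shape
    s = np.zeros((h, w), dtype=np.int64)   # cumulative row sums
    ii = np.zeros((h, w), dtype=np.int64)  # integral image
    for x in range(h):
        for y in range(w):
            s[x, y] = (s[x, y - 1] if y > 0 else 0) + img[x, y]
            ii[x, y] = (ii[x - 1, y] if x > 0 else 0) + s[x, y]
    return ii
```

In practice `img.cumsum(axis=0).cumsum(axis=1)` produces the same array without the Python-level loops.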
Haar-like Feature: Integral Image [7]

Using the integral image, any rectangular sum can be computed in four array references (see Fig. 3).
Our hypothesis, which is borne out by experiment, is that a very small number of these features can be combined to form an effective classifier. The main challenge is to find these features.
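A sketch of the four-reference lookup; the helper name and the inclusive (row, column) bounds are illustrative assumptions:

```python
import numpy as np

def rect_sum(ii, top, left, bottom, right):
    """Sum of the pixels in rows top..bottom and columns left..right
    (inclusive) using at most four references into the integral image ii."""
    total = int(ii[bottom, right])
    if top > 0:
        total -= int(ii[top - 1, right])      # strip above the rectangle
    if left > 0:
        total -= int(ii[bottom, left - 1])    # strip left of the rectangle
    if top > 0 and left > 0:
        total += int(ii[top - 1, left - 1])   # corner was subtracted twice
    return total
```

A two-rectangle Haar-like feature is then just the difference of two such sums, so each feature costs a handful of array references regardless of its size.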
Dimension Reduction: PCA
Abstract [1]
Principal component analysis (PCA) is a technique that is useful for the compression and classification of data. The purpose is to reduce the dimensionality of a data set (sample) by finding a new set of variables, smaller than the original set of variables, that nonetheless retains most of the sample's information.
By information we mean the variation present in the sample, given by the correlations between the original variables. The new variables, called principal components (PCs), are uncorrelated, and are ordered by the fraction of the total information each retains.
Geometric Picture of Principal Components [1]

A sample of n observations in the 2-D space.
Goal: to account for the variation in a sample in as few variables as possible, to some accuracy.
Geometric Picture of Principal Components [1]

• The 1st PC is a minimum-distance fit to a line in X space.
• The 2nd PC is a minimum-distance fit to a line in the plane perpendicular to the 1st PC.
PCs are a series of linear least-squares fits to a sample, each orthogonal to all the previous ones.
Usage of PCA: Data Compression [1]

Because the kth PC retains the kth greatest fraction of the variation, we can approximate each observation by truncating the sum at the first m < p PCs.
Usage of PCA: Data Compression [1]

Reduce the dimensionality of the data from p to m < p by approximating the data with the truncated expansion, where the n × m portion of the PC score matrix is combined with the p × m portion of the eigenvector matrix (n: number of samples; the matrix symbols themselves were lost in this transcript).
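A minimal sketch of this truncation with NumPy, assuming the PCs are taken as eigenvectors of the sample covariance; the function name is illustrative:

```python
import numpy as np

def pca_compress(X, m):
    """Approximate the n x p data matrix X using only its first m PCs:
    project the centered data onto the top-m eigenvectors of the sample
    covariance, then map back. With m = p the reconstruction is exact."""
    mu = X.mean(axis=0)
    Xc = X - mu                         # center the sample
    C = np.cov(Xc, rowvar=False)        # p x p sample covariance
    vals, vecs = np.linalg.eigh(C)      # eigenvalues in ascending order
    A_m = vecs[:, ::-1][:, :m]          # p x m: top-m eigenvectors
    Z_m = Xc @ A_m                      # n x m: PC scores
    return Z_m @ A_m.T + mu             # rank-m approximation of X
```

Because the discarded PCs carry the smallest fractions of the variation, the rank-m reconstruction error is as small as any m-dimensional linear approximation can achieve.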
Derivation of PCA using the Covariance Method [8]

Let X be a d-dimensional random vector expressed as a column vector. Without loss of generality, assume X has zero mean. We want to find an orthonormal transformation matrix P such that

Y = PX

with the constraint that cov(Y) is a diagonal matrix and P⁻¹ = Pᵀ; that is, PX is a random vector with all its distinct components pairwise uncorrelated.
By substitution and matrix algebra, we obtain:

cov(Y) = E[YYᵀ] = E[(PX)(PX)ᵀ] = P E[XXᵀ] Pᵀ = P cov(X) Pᵀ
Derivation of PCA using the Covariance Method [8]

We now have:

cov(Y) = P cov(X) Pᵀ  ⇒  cov(X) Pᵀ = Pᵀ cov(Y)   (using P⁻¹ = Pᵀ)

Rewrite Pᵀ as d column vectors, so Pᵀ = [P1, P2, …, Pd], and cov(Y) as diag(λ1, λ2, …, λd). Substituting into the equation above, we obtain:

[cov(X)P1, cov(X)P2, …, cov(X)Pd] = [λ1P1, λ2P2, …, λdPd]

Notice that in cov(X)Pi = λiPi, Pi is an eigenvector of the covariance matrix of X. Therefore, by finding the eigenvectors of the covariance matrix of X, we find a projection matrix P that satisfies the original constraints.
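A numerical check of this derivation on illustrative random data; `np.linalg.eigh` returns orthonormal eigenvectors of the covariance matrix, which form the rows of P:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 1000))          # d x n sample of a random vector
X = X - X.mean(axis=1, keepdims=True)   # enforce zero mean
C = np.cov(X)                           # d x d covariance matrix of X
vals, vecs = np.linalg.eigh(C)          # C @ vecs[:, i] == vals[i] * vecs[:, i]
P = vecs.T                              # rows P_i are eigenvectors of C
D = P @ C @ P.T                         # P cov(X) P^T: should be diagonal
```

D comes out as diag(λ1, …, λd), confirming that the components of PX are pairwise uncorrelated.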
Bayesian Decision Theory
State of Nature [2]

We let ω denote the state of nature, with ω = ω1 for sea bass and ω = ω2 for salmon. Because the state of nature is so unpredictable, we consider ω to be a variable that must be described probabilistically.

If the two kinds of fish are equally likely, P(ω1) = P(ω2) (uniform priors). More generally, we assume that there is some a priori probability (or simply prior) P(ω1) that the next fish is sea bass, and some prior probability P(ω2) that it is salmon.

P(ω1) + P(ω2) = 1 (exclusivity and exhaustivity)

Decision rule with only the prior information:
Decide ω1 if P(ω1) > P(ω2); otherwise decide ω2.
Class-Conditional Probability Density [2]

In most circumstances we are not asked to make decisions with so little information. In our example, we might for instance use a lightness measurement x to improve our classifier.

We consider x to be a continuous random variable whose distribution depends on the state of nature and is expressed as p(x|ω). This is the class-conditional probability density function: the probability density function for x given that the state of nature is ω.

Hypothetical class-conditional probability density functions show the probability density of measuring a particular feature value x given that the pattern is in category ωi.
Posterior, Likelihood, Evidence [2]

Suppose that we know both the prior probabilities P(ωj) and the conditional densities p(x|ωj) for j = 1, 2. Suppose further that we measure the lightness of a fish and discover that its value is x.

How does this measurement influence our attitude concerning the true state of nature, that is, the category of the fish?
Posterior, Likelihood, Evidence [2]

Bayes formula:

P(ωj|x) = p(x|ωj) P(ωj) / p(x)

where, in the case of two categories,

p(x) = Σ_{j=1..2} p(x|ωj) P(ωj)

Then,

posterior = (likelihood × prior) / evidence
Posterior, Likelihood, Evidence [2]

Posterior probabilities for the particular priors P(ω1) = 2/3 and P(ω2) = 1/3, for the class-conditional probability densities shown in Fig. 2.1. Thus in this case, given that a pattern is measured to have feature value x = 14, the probability it is in category ω2 is roughly 0.08, and the probability it is in ω1 is 0.92. At every x, the posteriors sum to 1.0.
Decision given the Posterior Probabilities [2]

x is an observation for which:
if P(ω1|x) > P(ω2|x), the true state of nature is ω1;
if P(ω1|x) < P(ω2|x), the true state of nature is ω2.

Whenever we observe a particular x, the probability of error is:
P(error|x) = P(ω1|x) if we decide ω2
P(error|x) = P(ω2|x) if we decide ω1

Therefore:
Decide ω1 if P(ω1|x) > P(ω2|x); otherwise decide ω2.
Then P(error|x) = min[P(ω1|x), P(ω2|x)] (Bayes decision).
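A sketch of the posterior computation and the Bayes decision; the Gaussian class-conditional densities and all the numbers below are hypothetical, not the ones behind Fig. 2.1:

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Hypothetical 1-D class-conditional density p(x|w)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def bayes_decide(x, priors, densities):
    """Posteriors P(wj|x) = p(x|wj) P(wj) / p(x); decide the class with
    the larger posterior, so P(error|x) = min of the posteriors."""
    joint = np.array([p(x) * P for p, P in zip(densities, priors)])
    posteriors = joint / joint.sum()    # joint.sum() is the evidence p(x)
    return int(np.argmax(posteriors)), posteriors

# w1 = sea bass, w2 = salmon; illustrative lightness densities and priors
densities = [lambda x: gaussian(x, 12.0, 2.0), lambda x: gaussian(x, 16.0, 2.5)]
priors = [2 / 3, 1 / 3]
decided, post = bayes_decide(14.0, priors, densities)
```

Note that the evidence p(x) only rescales the joint terms; it never changes which posterior is largest.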
Bayesian Decision Theory: Risk Minimization [2]

Generalization of the preceding ideas:
- Use of more than one feature
- Use of more than two states of nature
- Allowing actions, and not only deciding on the state of nature
- Introducing a loss function which is more general than the probability of error

Feature vector & feature space:
The feature vector x is in a d-dimensional Euclidean space R^d, called the feature space.
Risk Minimization: Loss Function [2]

Formally, the loss function states how costly each action taken is, and is used to convert a probability determination into a decision.

Let {ω1, ω2, …, ωc} be the set of c states of nature (or “categories”).
Let {α1, α2, …, αa} be the set of possible actions.
Let λ(αi|ωj) be the loss incurred for taking action αi when the state of nature is ωj.

Conditional risk:

R(αi|x) = Σ_{j=1..c} λ(αi|ωj) P(ωj|x),  for i = 1, …, a

Overall risk:

R = ∫ R(α(x)|x) p(x) dx

Minimizing R amounts to minimizing R(αi|x) for i = 1, …, a at every x.
Risk Minimization [2]

Two-category classification:
α1: deciding ω1
α2: deciding ω2
λij = λ(αi|ωj): the loss incurred for deciding ωi when the true state of nature is ωj

Conditional risk:
R(α1|x) = λ11 P(ω1|x) + λ12 P(ω2|x)
R(α2|x) = λ21 P(ω1|x) + λ22 P(ω2|x)
Risk Minimization [2]

Two-category classification:
Our rule is the following:
if R(α1|x) < R(α2|x), action α1 (“decide ω1”) is taken.

This results in the equivalent rule: decide ω1 if

(λ21 − λ11) p(x|ω1) P(ω1) > (λ12 − λ22) p(x|ω2) P(ω2)

and decide ω2 otherwise.
Risk Minimization [2]

Two-category classification:
Likelihood ratio: the preceding rule is equivalent to the following rule:

if p(x|ω1)/p(x|ω2) > [(λ12 − λ22)/(λ21 − λ11)] · [P(ω2)/P(ω1)]

then take action α1 (decide ω1); otherwise take action α2 (decide ω2).

Optimal decision property: “If the likelihood ratio exceeds a threshold value independent of the input pattern x, we can take optimal actions.”
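A sketch of this rule; the densities and losses are hypothetical, and `loss[i][j]` stands for the loss of deciding class i+1 when the truth is class j+1:

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Hypothetical 1-D class-conditional density."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def likelihood_ratio_decide(x, p1, p2, priors, loss):
    """Decide w1 iff p(x|w1)/p(x|w2) exceeds the threshold
    [(l12 - l22)/(l21 - l11)] * P(w2)/P(w1); the threshold does not
    depend on x, which is the optimal decision property quoted above."""
    (l11, l12), (l21, l22) = loss
    threshold = (l12 - l22) / (l21 - l11) * priors[1] / priors[0]
    return 1 if p1(x) / p2(x) > threshold else 2

p1 = lambda x: gaussian(x, 12.0, 2.0)   # hypothetical p(x|w1)
p2 = lambda x: gaussian(x, 16.0, 2.5)   # hypothetical p(x|w2)
zero_one = [[0.0, 1.0], [1.0, 0.0]]     # zero-one loss: threshold = P(w2)/P(w1)
```

With the zero-one loss the threshold reduces to P(ω2)/P(ω1), recovering the minimum-error-rate rule.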
Minimum Error Rate Classification [2]

Actions are decisions on classes. If action αi is taken and the true state of nature is ωj, then the decision is correct if i = j and in error if i ≠ j.

We seek a decision rule that minimizes the probability of error, which is the error rate.

Introduction of the zero-one loss function:

λ(αi, ωj) = 0 if i = j, 1 if i ≠ j,  for i, j = 1, …, c

Therefore, the conditional risk is:

R(αi|x) = Σ_{j=1..c} λ(αi|ωj) P(ωj|x) = Σ_{j≠i} P(ωj|x) = 1 − P(ωi|x)

“The risk corresponding to this loss function is the average probability of error.”
Bayesian Decision Theory: Continuous Features [2]

Generalization of the preceding ideas:
- Use of more than one feature
- Use of more than two states of nature
- Allowing actions, and not only deciding on the state of nature
- Introducing a loss function which is more general than the probability of error
Classifier, Discriminant Functions, and Decision Surface [2]

Set of discriminant functions gi(x), i = 1, …, c.
The classifier assigns a feature vector x to class ωi if gi(x) > gj(x) for all j ≠ i.

The functional structure of a general statistical pattern classifier includes d inputs and c discriminant functions gi(x). A subsequent step determines which of the discriminant values is the maximum, and categorizes the input pattern accordingly. The arrows show the direction of the flow of information, though frequently the arrows are omitted when the direction of flow is self-evident.
The MultiThe Multi--category casecategory case
Let gi(x) = −R(αi | x) (max. discriminant corresponds to min. risk!)
For the minimum error rate, we take gi(x) = P(ωi | x)
(max. discriminant corresponds to max. posterior!)
gi(x) ∝ p(x | ωi) P(ωi)
gi(x) = ln p(x | ωi) + ln P(ωi) (ln: natural logarithm!)
Feature space divided into c decision regions:
if gi(x) > gj(x) for all j ≠ i, then x is in Ri
(Ri means: assign x to ωi)
Classifier, Discriminant Functions, and Decision Surface [2]
The Two-category Case
A classifier is a “dichotomizer” that has two discriminant functions g1 and g2.
Let g(x) ≡ g1(x) − g2(x)
Decide ω1 if g(x) > 0;
otherwise decide ω2
The computation of g(x) (two equivalent choices):
g(x) = P(ω1 | x) − P(ω2 | x)
g(x) = ln [p(x | ω1) / p(x | ω2)] + ln [P(ω1) / P(ω2)]
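As an illustration of the dichotomizer, the following sketch assumes univariate Gaussian class-conditional densities with equal variance (an illustrative assumption, not from the slide) and decides ω1 when g(x) > 0.

```python
import math

def dichotomizer(x, mu1, mu2, sigma, prior1, prior2):
    """g(x) = ln p(x|w1) - ln p(x|w2) + ln P(w1) - ln P(w2) for
    equal-variance univariate Gaussian class conditionals (assumed)."""
    def log_gauss(x, mu, s):
        return -0.5 * ((x - mu) / s) ** 2 - math.log(math.sqrt(2 * math.pi) * s)
    g = (log_gauss(x, mu1, sigma) - log_gauss(x, mu2, sigma)
         + math.log(prior1) - math.log(prior2))
    return 1 if g > 0 else 2  # decide w1 if g(x) > 0, else w2

# x = 0.8 lies closer to the class-1 mean, so class 1 is chosen.
print(dichotomizer(0.8, mu1=1.0, mu2=-1.0, sigma=1.0, prior1=0.5, prior2=0.5))
```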
Classifier, Discriminant Functions, and Decision Surface [2]
In this two-dimensional two-category classifier, the probability densities are Gaussian, the decision boundary consists of two hyperbolas, and thus the decision region R2 is not simply connected. The ellipses mark where the density is 1/e times that at the peak of the distribution.
Classifier, Discriminant Functions, and Decision Surface [2]
The Two-category Case
Bayesian Discriminant Function for Normal Density
The Normal Density [2]
Univariate density
- Analytically tractable
- Continuous density
- Many processes are asymptotically Gaussian
- Handwritten characters and speech sounds can be viewed as ideal prototypes corrupted by a random process (central limit theorem)
p(x) = 1/(√(2π) σ) · exp[ −(1/2) ((x − μ)/σ)² ]
where:
μ = mean (or expected value) of x
σ² = expected squared deviation, or variance
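The density above can be evaluated directly; the following is a minimal sketch.

```python
import math

def univariate_normal(x, mu, sigma):
    """p(x) = 1/(sqrt(2*pi)*sigma) * exp(-0.5*((x - mu)/sigma)**2)"""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2 * math.pi) * sigma)

# The peak at x = mu has value 1/(sigma*sqrt(2*pi)) ~ 0.3989 for sigma = 1.
print(round(univariate_normal(0.0, 0.0, 1.0), 4))  # 0.3989
```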
The Normal Density [2]
A univariate normal distribution has roughly 95% of its area in the range |x − μ| ≤ 2σ, as shown. The peak of the distribution has value p(μ) = 1/(σ√(2π)).
Multivariate density
Multivariate normal density in d dimensions is:
p(x) = 1/((2π)^(d/2) |Σ|^(1/2)) · exp[ −(1/2) (x − μ)ᵗ Σ⁻¹ (x − μ) ]
where:
x = (x1, x2, …, xd)ᵗ (t stands for the transpose vector form)
μ = (μ1, μ2, …, μd)ᵗ mean vector
Σ = d×d covariance matrix
|Σ| and Σ⁻¹ are its determinant and inverse, respectively
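The multivariate density can be transcribed directly; the test point, mean, and covariance below are arbitrary illustrative values.

```python
import numpy as np

def multivariate_normal_pdf(x, mu, Sigma):
    """p(x) = exp(-0.5*(x-mu)^t Sigma^{-1} (x-mu)) / ((2*pi)^(d/2) |Sigma|^(1/2))"""
    d = len(mu)
    diff = x - mu
    mahal = diff @ np.linalg.inv(Sigma) @ diff  # squared Mahalanobis distance
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * mahal) / norm

x = np.array([0.0, 0.0])
mu = np.array([0.0, 0.0])
Sigma = np.eye(2)
# At the mean with identity covariance, p(mu) = 1/(2*pi) ~ 0.1592.
print(round(float(multivariate_normal_pdf(x, mu, Sigma)), 4))  # 0.1592
```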
The Normal Density [2]
We saw that the minimum error-rate classification can be achieved by the
discriminant function
gi(x) = ln p(x | ωi) + ln P(ωi)
Case of multivariate normal [6]
Discriminant Function for the Normal Density [2]
gi(x) = −(1/2) (x − μi)ᵗ Σi⁻¹ (x − μi) − (d/2) ln 2π − (1/2) ln |Σi| + ln P(ωi)
→ the quadratic discriminant function
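The quadratic discriminant above, evaluated for two hypothetical classes (means, covariances, and priors are invented for illustration):

```python
import numpy as np

def quadratic_discriminant(x, mu, Sigma, prior):
    """g_i(x) = -0.5*(x-mu)^t Sigma^{-1} (x-mu) - (d/2)*ln(2*pi)
               - 0.5*ln|Sigma| + ln P(w_i)"""
    d = len(mu)
    diff = x - mu
    return (-0.5 * diff @ np.linalg.inv(Sigma) @ diff
            - 0.5 * d * np.log(2 * np.pi)
            - 0.5 * np.log(np.linalg.det(Sigma))
            + np.log(prior))

# Two hypothetical classes with different means; pick the larger discriminant.
x = np.array([1.0, 1.0])
g1 = quadratic_discriminant(x, np.array([1.0, 1.0]), np.eye(2), 0.5)
g2 = quadratic_discriminant(x, np.array([-1.0, -1.0]), np.eye(2), 0.5)
print(1 if g1 > g2 else 2)  # x coincides with the class-1 mean, so class 1
```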
Discriminant Function for the Normal Density [6]
Discriminant Functions for the Normal Density [2]
If the covariance matrices for two distributions are equal and proportional to the identity matrix, then the distributions are spherical in d dimensions, and the boundary is a generalized hyperplane of d −1 dimensions, perpendicular to the line separating the means.
In these one-, two-, and three-dimensional examples, we indicate p(x|ωi ) and the boundaries for the case P(ω1) = P(ω2). In the three-dimensional case, the grid plane separates R1 from R2.
Discriminant Functions for the Normal Density [6]
Discriminant Functions for the Normal Density [2]
Probability densities (indicated by the surfaces in two dimensions and ellipsoidal surfaces in three dimensions) and decision regions for equal but asymmetric Gaussian distributions. The decision hyperplanes need not be perpendicular to the line connecting the means.
Discriminant Functions for the Normal Density [6]
Discriminant Functions for the Normal Density [2]
Arbitrary Gaussian distributions lead to Bayes decision boundaries that are general hyperquadrics. Conversely, given any hyperquadric, one can find two Gaussian distributions whose Bayes decision boundary is that hyperquadric. These variances are indicated by the contours of constant probability density.
Discriminant Functions for the Normal Density [2]
Arbitrary three-dimensional Gaussian distributions yield Bayes decision boundaries that are two-dimensional hyperquadrics. There are even degenerate cases in which the decision boundary is a line.
Discriminant Functions for the Normal Density [6]
Linear Discriminant Analysis
Linear Discriminant Functions
Linear Discriminant Functions [6]
Minimum Squared Error Solution [6]
The Pseudo-Inverse Solution [6]
Least-Mean-Squares Solution [6]
Summary: Perceptron vs. MSE Procedures [6]
The Ho-Kashyap Procedure [6]
Optimal Separating Hyperplanes [6]
Distance between a plane and a point
Optimal Separating Hyperplanes [6]
Consider the two-dimensional optimization problem: find x and y to maximize f(x, y) subject to a constraint (shown in red) g(x, y) = c. We can visualize contours of f given by f(x, y) = d for various values of d, together with the contour of g given by g(x, y) = c.
Lagrange Multipliers [9]
When f(x,y) becomes maximum on the path of g(x,y)=c, the contour line for g=c meets contour lines of f tangentially. Since the gradient of a function is perpendicular to the contour lines, this is the same as saying that the gradients of f and g are parallel.
Contour map. The red line shows the constraint g(x,y) = c. The blue lines are contours of f(x,y). The point where the red line tangentially touches a blue contour is our solution.
Lagrange Multipliers [9]
To incorporate these conditions into one equation, we introduce an auxiliary function
Λ(x, y, λ) = f(x, y) + λ · (g(x, y) − c)
and solve ∇x,y,λ Λ(x, y, λ) = 0.
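As a worked example (not from the slides): maximize f(x, y) = x + y subject to x² + y² = 1, using the auxiliary function above.

```python
import math

# Maximize f(x, y) = x + y subject to g(x, y) = x**2 + y**2 = 1.
# Auxiliary function: L(x, y, lam) = f(x, y) + lam * (g(x, y) - 1).
# Setting the gradient to zero gives 1 + 2*lam*x = 0 and 1 + 2*lam*y = 0,
# so x = y, and the constraint then forces x = y = ±1/sqrt(2).
x = y = 1 / math.sqrt(2)  # the + root maximizes f
lam = -1 / (2 * x)

# Stationarity and feasibility both hold at the solution:
assert abs(1 + 2 * lam * x) < 1e-12
assert abs(x**2 + y**2 - 1) < 1e-12
print(round(x + y, 4))  # maximum value sqrt(2) ~ 1.4142
```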
Lagrange Multipliers [9]
The Lagrangian Dual Problem [6]
Minimize (in w, b)
‖w‖²/2
subject to (for any i = 1,…, n)
yi (w·xi − b) ≥ 1
One could be tempted to express the previous problem by means of non-negative Lagrange multipliers αi as
min over w, b, α ≥ 0 of { ‖w‖²/2 − Σi αi [yi (w·xi − b) − 1] };
we could find the minimum by sending all αi to ∞. Nevertheless, the previous constrained problem can be expressed as
min over w, b of max over α ≥ 0 of { ‖w‖²/2 − Σi αi [yi (w·xi − b) − 1] }.
That is, we look for a saddle point.
Dual Problem [10]
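The primal constraints can be checked on a toy example; the data and solution below are hypothetical, chosen so the hard-margin solution is known in closed form (support vectors at x = ±1 give w = 1, b = 0 in one dimension).

```python
# Toy 1-D linearly separable data (hypothetical, not from the slides).
X = [1.0, -1.0]
y = [1, -1]
w, b = 1.0, 0.0  # known hard-margin solution for this data

# Every point satisfies the primal constraint y_i * (w * x_i - b) >= 1,
# and both points lie exactly on the margin (constraints active).
assert all(yi * (w * xi - b) >= 1 - 1e-12 for xi, yi in zip(X, y))

print(2 / abs(w))  # geometric margin width 2/||w|| = 2.0
```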
Implicit Mappings: An Example [6]
Kernel Methods [6]
Kernel Functions
Architecture of an SVM [6]
The k Nearest Neighbor Classification Rule [6]
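A minimal sketch of the k nearest neighbor rule named above (1-D data and labels invented for illustration): label a query point by majority vote among its k nearest training samples.

```python
from collections import Counter

def knn_classify(x, data, labels, k):
    """Label x by majority vote among its k nearest training samples
    (Euclidean distance in 1-D for simplicity)."""
    nearest = sorted(range(len(data)), key=lambda i: abs(data[i] - x))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

data = [0.0, 0.2, 0.4, 2.0, 2.2, 2.4]
labels = ['a', 'a', 'a', 'b', 'b', 'b']
print(knn_classify(0.3, data, labels, k=3))  # all 3 neighbors vote 'a'
```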
Non-parametric Unsupervised Learning [6]
Criterion Function for Clustering [6]
Iterative Optimization [6]
The k-means Algorithm [6]
The k-means Algorithm [4]
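A minimal sketch of Lloyd's k-means iteration (1-D toy data, invented for illustration): alternately assign each point to the nearest mean, then recompute each mean as the average of its assigned points.

```python
import random

def kmeans(data, k, iters=20, seed=0):
    """Lloyd's iteration: assign each point to the nearest mean, then
    recompute each mean as the average of its assigned points (1-D sketch)."""
    rng = random.Random(seed)
    means = rng.sample(data, k)  # initialize means from the data points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:
            clusters[min(range(k), key=lambda j: (x - means[j]) ** 2)].append(x)
        means = [sum(c) / len(c) if c else means[j]
                 for j, c in enumerate(clusters)]
    return sorted(means)

data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
print(kmeans(data, k=2))  # converges to two means near 0.1 and 5.1
```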
References
1. Frank Masci, “An Introduction to Principal Component Analysis,” http://web.ipac.caltech.edu/staff/fmasci/home/statistics_refs/PrincipalComponentAnalysis.pdf
2. Richard O. Duda, Peter E. Hart, David G. Stork, Pattern Classification, second edition, John Wiley & Sons, Inc., 2001.
3. Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2007.
4. Sergios Theodoridis, Konstantinos Koutroumbas, Pattern Recognition, Academic Press, 2006.
5. Ho Gi Jung, Yun Hee Lee, Pal Joo Yoon, In Yong Hwang, and Jaihie Kim, “Sensor Fusion Based Obstacle Detection/Classification for Active Pedestrian Protection System,” Lecture Notes in Computer Science, Vol. 4292, pp. 294-305.
6. Ricardo Gutierrez-Osuna, “Pattern Recognition, Lecture Notes,” available at http://research.cs.tamu.edu/prism/lectures.htm
7. Paul Viola, Michael Jones, “Robust real-time object detection,” International Journal of Computer Vision, 57(2), 2004, 137-154.
8. Wikipedia, “Principal component analysis,” available at http://en.wikipedia.org/wiki/Principal_component_analysis
9. Wikipedia, “Lagrange multipliers,” available at http://en.wikipedia.org/wiki/Lagrange_multipliers
10. Wikipedia, “Support vector machine,” available at http://en.wikipedia.org/wiki/Support_vector_machine