Transcript of Advanced Computer Vision

Page 1: Advanced Computer Vision

Advanced Computer Vision

Chapter 4: Feature Detection and Matching

Presented by: 傅楸善 & 江祖榮, [email protected]

Page 2: Advanced Computer Vision

Feature Detection and Matching (1/3)

• 4.1 Points and Patches
• 4.2 Edges
• 4.3 Lines

Page 3: Advanced Computer Vision

Feature Detection and Matching (2/3)

Page 4: Advanced Computer Vision

Feature Detection and Matching (3/3)

Page 5: Advanced Computer Vision

4.1 Points and Patches (1/2)

• There are two main approaches:
– Find features in one image that can be accurately tracked using a local search technique (e.g., in video sequences).

Page 6: Advanced Computer Vision

4.1 Points and Patches (1/2)

• Independently detect features in all the images under consideration and then match features based on their local appearance. (e.g., stitching panoramas, establishing correspondences)

Page 7: Advanced Computer Vision

Points and Patches (2/2)

• Three stages:
– Feature detection (extraction) stage
– Feature description stage
– Feature matching stage or feature tracking stage

Page 8: Advanced Computer Vision

4.1.1 Feature Detectors

Page 9: Advanced Computer Vision

Aperture Problem

Page 10: Advanced Computer Vision
Page 11: Advanced Computer Vision

Weighted Summed Square Difference

• E_WSSD(u) = Σi w(xi) [I1(xi + u) − I0(xi)]²
• I0, I1: the two images being compared
• u: the displacement vector
• w(x): spatially varying weighting function
• summation i: over all the pixels in the patch
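Below is a minimal NumPy sketch of this weighted SSD. The image size, patch location, unit weights, and displacement are illustrative assumptions, not values from the slides.

```python
import numpy as np

def weighted_ssd(I0, I1, xs, ys, u, w):
    """E_WSSD(u) = sum_i w(x_i) * [I1(x_i + u) - I0(x_i)]^2 over patch pixels."""
    du, dv = u                               # displacement (x, y)
    diff = I1[ys + dv, xs + du] - I0[ys, xs]
    return np.sum(w * diff ** 2)

# toy example: 5x5 patch centred at (20, 30), unit weights, displacement u = (1, 0)
ys, xs = np.mgrid[28:33, 18:23]
I0 = np.random.rand(64, 64).astype(np.float32)
I1 = np.roll(I0, 1, axis=1)                  # I1 is I0 shifted right by one pixel
E = weighted_ssd(I0, I1, xs, ys, (1, 0), np.ones_like(xs, dtype=np.float32))
print(E)                                     # zero, since the shift matches u
```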

Page 12: Advanced Computer Vision

Auto-correlation Function or Surface

• E_AC(∆u) = Σi w(xi) [I0(xi + ∆u) − I0(xi)]²
• I0: the image being compared with itself
• ∆u: small variations in position
• w(x): spatially varying weighting function
• summation i: over all the pixels in the patch

Page 13: Advanced Computer Vision
Page 14: Advanced Computer Vision

• joke

Page 15: Advanced Computer Vision

Approximation of Auto-correlation Function (1/3)

• I0(xi + ∆u) ≈ I0(xi) + ∇I0(xi) · ∆u

• ∇I0(xi): image gradient at xi

Page 16: Advanced Computer Vision

E(u, v) ≈ F u² + 2 H u v + G v²

where
F = Σ_{x,y} w(x, y) Ix²(x, y)
G = Σ_{x,y} w(x, y) Iy²(x, y)
H = Σ_{x,y} w(x, y) Ix(x, y) Iy(x, y)

Page 17: Advanced Computer Vision

Approximation of Auto-correlation Function (2/3)

• Auto-correlation matrix A:

A = w * [ Ix²   IxIy
          IxIy  Iy² ]

• w: weighting kernel (from the spatially varying weighting function); * denotes convolution

• Ix, Iy: horizontal and vertical derivatives of Gaussians

Page 18: Advanced Computer Vision

Approximation of Auto-correlation Function (3/3)

Intensity change in the shifting window: eigenvalue analysis

• λ1, λ2: eigenvalues of A

(Figure: the ellipse E(u, v) = const; its axes point along the directions of fastest and slowest change, with lengths (λmax)^(-1/2) and (λmin)^(-1/2).)

Page 19: Advanced Computer Vision

Approximation of Auto-correlation Function (3/3)

Classification of image points using the eigenvalues of A:

• Corner: λ1 and λ2 are both large and λ1 ~ λ2; E increases in all directions.
• Edge: λ1 >> λ2, or λ2 >> λ1.
• Flat: λ1 and λ2 are both small; E is almost constant in all directions.

Page 20: Advanced Computer Vision

Approximation of Auto-correlation Function (3/3)

• Assume (λ0, λ1) are the two eigenvalues of A, with λ0 ≤ λ1.

• Since the larger uncertainty depends on the smaller eigenvalue, it makes sense to find maxima of the smaller eigenvalue to locate good features to track.

Page 21: Advanced Computer Vision

Other Measurements (1/2)

• Quantity proposed by Triggs: λ0 − α λ1

• Quantity proposed by Harris and Stephens: det(A) − α trace(A)² = λ0 λ1 − α (λ0 + λ1)²

• α = 0.05
• No square roots are required.
• It is still rotationally invariant and downweights edge-like features when λ1 >> λ0.

Page 22: Advanced Computer Vision

Other Measurements (2/2)

• Quantity proposed by Brown, Szeliski, and Winder (the harmonic mean): det(A) / trace(A) = λ0 λ1 / (λ0 + λ1)

• It can be used when λ1 ≈ λ0.

Page 23: Advanced Computer Vision

Basic Feature Detection Algorithm (1/2)

• Step 1: Compute the horizontal and vertical derivatives of the image Ix and Iy by convolving the original image with derivatives of Gaussians.

• Step 2: Compute the three images (Ix², Iy², IxIy) corresponding to the outer products of these gradients.

Page 24: Advanced Computer Vision

Basic Feature Detection Algorithm (2/2)

• Step 3: Convolve each of these images with a larger Gaussian (the weighting kernel).

• Step 4: Compute a scalar interest measure using one of the formulas discussed above.

• Step 5: Find local maxima above a certain threshold and report them as detected feature point locations.
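A minimal sketch of Steps 1–5 in Python with SciPy. It uses the Harris–Stephens measure with α = 0.05 from the earlier slide as the scalar interest measure; the two σ values, the window size, and the threshold are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def harris_corners(image, sigma_d=1.0, sigma_i=2.0, alpha=0.05, thresh=1e-3):
    """image: float grayscale array. Returns (N, 2) corner locations (x, y) and R."""
    # Step 1: horizontal/vertical derivatives via derivatives of Gaussians
    Ix = ndimage.gaussian_filter(image, sigma_d, order=(0, 1))
    Iy = ndimage.gaussian_filter(image, sigma_d, order=(1, 0))
    # Step 2: outer products of the gradients
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy
    # Step 3: convolve with a larger Gaussian (the weighting kernel)
    Sxx = ndimage.gaussian_filter(Ixx, sigma_i)
    Syy = ndimage.gaussian_filter(Iyy, sigma_i)
    Sxy = ndimage.gaussian_filter(Ixy, sigma_i)
    # Step 4: scalar interest measure det(A) - alpha * trace(A)^2
    R = Sxx * Syy - Sxy ** 2 - alpha * (Sxx + Syy) ** 2
    # Step 5: local maxima above a threshold
    local_max = (R == ndimage.maximum_filter(R, size=5))
    ys, xs = np.nonzero(local_max & (R > thresh))
    return np.stack([xs, ys], axis=1), R
```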

Page 25: Advanced Computer Vision

• joke

Page 26: Advanced Computer Vision

Interest operator responses

Page 27: Advanced Computer Vision

Adaptive Non-maximal Suppression (ANMS) (1/2)

• Simply finding local maxima of the interest function leads to an uneven distribution of feature points.

• Detect only features that:
– are local maxima, and
– have a response value significantly (10%) greater than that of all of their neighbors within a radius r (a sketch of this rule follows below).
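A straightforward O(n²) sketch of this idea. The 10% margin comes from the slide (c_robust = 0.9); the number of points to keep is an illustrative assumption.

```python
import numpy as np

def anms(points, responses, num_keep=500, c_robust=0.9):
    """points: (N, 2) locations; responses: (N,) interest measure values."""
    n = len(points)
    radii = np.full(n, np.inf)
    for i in range(n):
        # suppression radius: distance to the nearest point that is at least
        # 10% stronger (f_i < c_robust * f_j); the strongest point keeps inf
        stronger = responses[i] < c_robust * responses
        if np.any(stronger):
            d2 = np.sum((points[stronger] - points[i]) ** 2, axis=1)
            radii[i] = np.sqrt(d2.min())
    keep = np.argsort(-radii)[:num_keep]   # largest radii -> even spatial spread
    return points[keep]
```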

Page 28: Advanced Computer Vision

Adaptive Non-maximal Suppression (ANMS) (2/2)

Page 29: Advanced Computer Vision

Measuring Repeatability

• Which feature points should we use among the large number of detected feature points?

• Measure repeatability after applying rotations, scale changes, illumination changes, viewpoint changes, and adding noise.

Page 30: Advanced Computer Vision

Scale Invariance (1/2)

• Problem: what if there are no good feature points in the image at a single scale?
• Solution: work at multiple scales (multi-scale).

Page 31: Advanced Computer Vision

4.1.2 Feature Descriptors

• Sum of squared differences
• Normalized cross-correlation (Chapter 8)

Page 32: Advanced Computer Vision

Scale Invariant Feature Transform (SIFT)
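For reference, a minimal sketch of detecting and describing SIFT features with OpenCV. It assumes an opencv-python build where SIFT is exposed as cv2.SIFT_create (true for recent versions); the input file name is hypothetical.

```python
import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)     # hypothetical input image
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
print(len(keypoints), descriptors.shape)                 # N keypoints, (N, 128) descriptors
vis = cv2.drawKeypoints(img, keypoints, None,
                        flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite("sift_keypoints.png", vis)
```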

Page 33: Advanced Computer Vision

Multi-scale Oriented Patches (MSOP)

• A simplified variant of SIFT, without DoGs at every scale
• Used for image stitching and similar tasks
• Detector: Harris corner detector
• Multi-scale -> makes the detector more robust
• DoG: Difference of Gaussians

Page 34: Advanced Computer Vision

Orientation Estimation

Page 35: Advanced Computer Vision

Gradient Location-orientation Histogram (GLOH)

Page 36: Advanced Computer Vision

Maximally Stable Extremal Region (MSER)

• Only works on grayscale images.
• Pixels are incrementally added to regions as the threshold is changed.
• Maximally stable: the rate of change of region area with respect to the threshold is minimal.

Page 37: Advanced Computer Vision

• joke

Page 38: Advanced Computer Vision

4.1.3 Feature Matching

• Two subjects:
– select a matching strategy
– devise efficient data structures and algorithms to perform this matching

Page 39: Advanced Computer Vision

Matching Strategy (1/7)

• Simplest method: set a distance threshold and match within this threshold.

Page 40: Advanced Computer Vision

Matching Strategy (2/7)

• Confusion matrix to estimate performance:

Page 41: Advanced Computer Vision

Worked example (TP = 18, FN = 2, FP = 4, TN = 76):

• recall = correct matches found / true correct matches = 18/20
• precision = correct matches found / total matches declared correct = 18/22
• false positive rate = 4/80
• accuracy (ACC) = (18 + 76)/(20 + 80) = 94/100

Page 42: Advanced Computer Vision

Matching Strategy (3/7)

ROC: Receiver Operating Characteristic

Page 43: Advanced Computer Vision

Matching Strategy (4/7)

• Other methods:
– Nearest neighbor (NN)
– Nearest neighbor distance ratio (NNDR)

• Nearest neighbor distance ratio: NNDR = d1 / d2 = ‖DA − DB‖ / ‖DA − DC‖
– d1, d2: nearest and second nearest distances
– DA: the target descriptor
– DB and DC: the closest two neighbors of DA
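A minimal NumPy sketch of nearest-neighbor matching with the NNDR test. The 0.8 ratio threshold is a commonly used value and an assumption here, not something fixed by the slide.

```python
import numpy as np

def match_nndr(desc0, desc1, ratio=0.8):
    """desc0: (N, D), desc1: (M, D) descriptor arrays.
    Returns (i, j) index pairs whose NNDR = d1/d2 falls below the ratio."""
    matches = []
    for i, d in enumerate(desc0):
        dist = np.linalg.norm(desc1 - d, axis=1)   # distances to all of desc1
        j1, j2 = np.argsort(dist)[:2]              # nearest and second nearest
        if dist[j1] < ratio * dist[j2]:            # NNDR test: d1/d2 < ratio
            matches.append((i, j1))
    return matches
```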

Page 44: Advanced Computer Vision

Matching Strategy (5/7)

Page 45: Advanced Computer Vision

Matching Strategy (6/7)

Page 46: Advanced Computer Vision

Matching Strategy (7/7)

• Indexing structures:
– multi-dimensional search tree, or hashing

(Figure: descriptors indexed along dimensions d1 and d2.)
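As one example of such an indexing structure, a sketch using SciPy's k-d tree to answer the nearest-neighbor queries needed for matching. The descriptor arrays and the 0.8 ratio are illustrative assumptions; for high-dimensional descriptors, approximate structures are usually preferred in practice, so this only shows the query pattern.

```python
import numpy as np
from scipy.spatial import cKDTree

desc0 = np.random.rand(1000, 128).astype(np.float32)   # descriptors of image 0
desc1 = np.random.rand(1200, 128).astype(np.float32)   # descriptors of image 1

tree = cKDTree(desc1)                 # build the multi-dimensional search tree once
dist, idx = tree.query(desc0, k=2)    # two nearest neighbors for every query descriptor
nndr = dist[:, 0] / dist[:, 1]        # nearest-neighbor distance ratio per query
matches = np.nonzero(nndr < 0.8)[0]   # keep queries passing the ratio test
print(len(matches))
```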

Page 47: Advanced Computer Vision

4.1.4 Feature Tracking (1/3)

• Find a set of likely feature locations in a first image and then search for their corresponding locations in subsequent images.

• The amount of motion and appearance deformation between adjacent frames is expected to be small.

Page 48: Advanced Computer Vision

Feature Tracking (2/3)

• Selecting good features to track is closely related to selecting good features to match.

• When searching for the corresponding patch, the weighted summed square difference usually works well enough.

Page 49: Advanced Computer Vision

Feature Tracking (3/3)

• If features are being tracked over longer image sequences, their appearance can undergo larger changes. Possible strategies:
– continuously match against the originally detected feature
– re-sample each subsequent frame at the matching location
– use an affine motion model to measure dissimilarity (e.g., the Kanade-Lucas-Tomasi (KLT) tracker; see the sketch below)
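A minimal sketch of this kind of frame-to-frame tracking with OpenCV's pyramidal Lucas-Kanade tracker. The file names, corner count, window size, and pyramid depth are illustrative assumptions.

```python
import cv2

prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)   # hypothetical video frames
curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# select good features to track (Shi-Tomasi: maxima of the smaller eigenvalue of A)
p0 = cv2.goodFeaturesToTrack(prev, maxCorners=200, qualityLevel=0.01, minDistance=7)

# track them into the next frame with pyramidal Lucas-Kanade
p1, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, p0, None,
                                           winSize=(21, 21), maxLevel=3)
tracked = p1[status.ravel() == 1]
print(len(tracked), "features tracked")
```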

Page 50: Advanced Computer Vision

• joke

Page 51: Advanced Computer Vision

4.2 Edges (1/2)

• What edges in an image are:
– the boundaries of objects
– occlusion events in 3D
– shadow boundaries or crease edges
– elements that can be grouped into longer curves or contours

Page 52: Advanced Computer Vision

Edges (2/2)

Page 53: Advanced Computer Vision

4.2.1 Edge Detection (1/3)

• An edge is a location of rapid intensity variation.

• J(x) = ∇I(x) = (∂I/∂x, ∂I/∂y)(x)
– J: local gradient vector; its direction is perpendicular to the edge and its magnitude is the intensity variation
– I: original image

Page 54: Advanced Computer Vision

Edge Detection (2/3)

• Taking image derivatives amplifies noise.
• Therefore smooth the image with a Gaussian filter first: Jσ(x) = ∇[Gσ(x) * I(x)] = [∇Gσ](x) * I(x)
– G: Gaussian filter
– σ: width of the Gaussian filter

Page 55: Advanced Computer Vision

Edge Detection (3/3)

• To thin such a continuous gradient image down to isolated edges:

• Use the Laplacian of Gaussian: Sσ(x) = ∇ · Jσ(x) = [∇²Gσ](x) * I(x)
– S: second-derivative (Laplacian) operator

• Then find the zero crossings of S, which occur at the maxima of the gradient.

Page 56: Advanced Computer Vision

The first-derivative maximum occurs exactly where the second derivative crosses zero.
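A sketch of this pipeline with SciPy: smooth while taking the Laplacian, then mark zero crossings where the gradient magnitude is also large. The σ value and the magnitude threshold are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def log_edges(image, sigma=2.0, grad_thresh=0.05):
    """image: float grayscale array. Returns a boolean edge map."""
    # S = Laplacian of Gaussian of the image (second-derivative operator)
    S = ndimage.gaussian_laplace(image, sigma)
    # J = gradient of the Gaussian-smoothed image; its magnitude keeps only
    # zero crossings that sit on a genuine intensity variation
    Jx = ndimage.gaussian_filter(image, sigma, order=(0, 1))
    Jy = ndimage.gaussian_filter(image, sigma, order=(1, 0))
    mag = np.hypot(Jx, Jy)
    # zero crossings: the sign of S differs from a horizontal or vertical neighbor
    sign = S > 0
    zc = (sign != np.roll(sign, 1, axis=0)) | (sign != np.roll(sign, 1, axis=1))
    return zc & (mag > grad_thresh)
```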

Page 57: Advanced Computer Vision

Scale Selection

(Figure panels: coarse scale vs. fine scale; minimum stable scale of the first-derivative image; minimum stable scale of the second-derivative image; resulting edge image.)

Page 58: Advanced Computer Vision

Color Edge Detection

(Figure panels: brightness gradient, color gradient, texture gradient, and their mixture.)

Page 59: Advanced Computer Vision

4.2.2 Edge Linking

• Goal: link isolated edgels into continuous contours.

• If zero crossings were used to find the edges, linking simply connects neighboring unlinked edgels.

• Otherwise, the orientations of adjacent edgels can be compared to decide whether to link them.

• The smoother the curves formed by the grouped edges, the more robust object recognition becomes.

Page 60: Advanced Computer Vision

Chain Representation (1/2)

• Chain code with eight directions (a small encoding sketch follows below):
N: 0, NE: 1, E: 2, SE: 3, S: 4, SW: 5, W: 6, NW: 7
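A small sketch that encodes a linked pixel contour into this eight-direction chain code. The direction-to-code mapping follows the slide; the sample contour is made up.

```python
# map a unit step (dy, dx) between successive contour pixels to its chain code;
# image coordinates: y grows downward, so "N" corresponds to dy = -1
CODES = {(-1, 0): 0, (-1, 1): 1, (0, 1): 2, (1, 1): 3,
         (1, 0): 4, (1, -1): 5, (0, -1): 6, (-1, -1): 7}

def chain_code(contour):
    """contour: list of (y, x) pixel coordinates along a linked edge."""
    return [CODES[(y1 - y0, x1 - x0)]
            for (y0, x0), (y1, x1) in zip(contour, contour[1:])]

print(chain_code([(5, 5), (4, 5), (4, 6), (5, 7)]))  # -> [0, 2, 3]
```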

Page 61: Advanced Computer Vision

Chain Representation (2/2)

• Arc length parameterization: each node along the curve is stored by its arc length s together with its x and y coordinates.

Page 62: Advanced Computer Vision

• joke

Page 63: Advanced Computer Vision

4.3 Lines

• The man-made world is full of straight lines, so detecting and matching these lines can be useful in a variety of applications.

Page 64: Advanced Computer Vision

4.3.1 Successive Approximation

• Line simplification:
– piecewise-linear polyline
– B-spline curve

Page 65: Advanced Computer Vision

4.3.2 Hough Transforms

• Original Hough transforms:

Page 66: Advanced Computer Vision

Hough Transforms Algorithm (1/2)

• Step 1: Clear the accumulator array.

• Step 2: For each detected edgel at location (x, y) with orientation θ = tan⁻¹(ny/nx), compute the value of d = x·nx + y·ny and increment the accumulator cell corresponding to (θ, d).

(The dot product of a point P's coordinates (x, y) with the normal vector of the line L through P gives the distance d from L to the origin; the vote is cast for that distance d.)

Page 67: Advanced Computer Vision

Hough Transforms Algorithm (2/2)

• Step 3: Find the peaks in the accumulator corresponding to lines.

• Step 4: Optionally re-fit the lines to the constituent edgels.
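A NumPy sketch of the oriented-edgel voting in Steps 1–3. The accumulator resolution and peak threshold are illustrative assumptions; Step 4 (re-fitting lines to their edgels) and non-maximum suppression of the accumulator are omitted to keep it short.

```python
import numpy as np

def hough_lines(edgels, normals, shape, d_bins=400, theta_bins=180, min_votes=20):
    """edgels: (N, 2) array of (x, y); normals: (N, 2) unit normals (nx, ny);
    shape: (H, W) of the image. Returns a list of (theta, d) line parameters."""
    diag = np.hypot(*shape)                                   # d lies in [-diag, diag]
    acc = np.zeros((theta_bins, d_bins), dtype=np.int32)      # Step 1: clear accumulator
    theta = np.arctan2(normals[:, 1], normals[:, 0])          # full-quadrant tan^-1(ny/nx)
    d = edgels[:, 0] * normals[:, 0] + edgels[:, 1] * normals[:, 1]  # d = x*nx + y*ny
    ti = ((theta + np.pi) / (2 * np.pi) * (theta_bins - 1)).astype(int)
    di = ((d + diag) / (2 * diag) * (d_bins - 1)).astype(int)
    np.add.at(acc, (ti, di), 1)                               # Step 2: vote
    peaks = np.argwhere(acc >= min_votes)                     # Step 3: find peaks
    return [((p[0] / (theta_bins - 1)) * 2 * np.pi - np.pi,   # back to (theta, d)
             (p[1] / (d_bins - 1)) * 2 * diag - diag) for p in peaks]
```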

Page 68: Advanced Computer Vision

RANSAC-based Line Detection

• Another alternative to the Hough transform is the RANdom SAmple Consensus (RANSAC) algorithm.

• RANSAC randomly chooses pairs of edgels to form a line hypothesis and then tests how many other edgels fall onto this line.

• Lines with sufficiently large numbers of matching edgels are then selected as the desired line segments.
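A compact sketch of this RANSAC loop over edgels. The iteration count, inlier tolerance, and minimum inlier count are illustrative assumptions.

```python
import numpy as np

def ransac_line(edgels, iters=500, tol=1.5, min_inliers=30):
    """edgels: (N, 2) array of (x, y). Returns (point, direction, inlier_indices) or None."""
    rng = np.random.default_rng(0)
    best = None
    for _ in range(iters):
        # hypothesis: the line through a random pair of edgels
        p, q = edgels[rng.choice(len(edgels), size=2, replace=False)]
        direction = (q - p) / (np.linalg.norm(q - p) + 1e-12)
        normal = np.array([-direction[1], direction[0]])
        dist = np.abs((edgels - p) @ normal)        # perpendicular distances to the line
        inliers = np.nonzero(dist < tol)[0]         # edgels that fall onto this line
        if best is None or len(inliers) > len(best[2]):
            best = (p, direction, inliers)
    if best is not None and len(best[2]) >= min_inliers:
        return best
    return None
```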

Page 69: Advanced Computer Vision
Page 70: Advanced Computer Vision

4.3.3 Vanishing Points (1/4)

• Parallel lines in 3D have the same vanishing point.

Page 71: Advanced Computer Vision

Vanishing Points (2/4)

• An alternative to the 2D polar (θ, d) representation for lines is to use the full 3D line equation m, projected onto the unit sphere.

• The location of a vanishing point hypothesis is the cross product of a pair of line equations: v_ij = m_i × m_j
– m_i, m_j: line equations in their 3D representation

Page 72: Advanced Computer Vision

Vanishing Points (3/4)

• Corresponding weight: w_ij = ‖v_ij‖ l_i l_j
– l_i, l_j: corresponding line segment lengths

• This has the desirable effect of downweighting (near-)collinear line segments and short line segments.

Page 73: Advanced Computer Vision

Vanishing Points (4/4)