Learning to Detect A Salient Object

Reporter: 鄭綱 (3/2)

Outline

• Introduction
• CRF Formulation of Static Salient Object Detection
  – Strong Contrast
  – Center-Surround Histogram
  – Color Spatial-Distribution
  – Learning & Inference for the Model
• Formulation of Dynamic Salient Object Detection
• Results
• Conclusion
• References

Introduction

Image Labeling Problem

[Figure: example image with regions labeled Sky, Building, Lawn, Plane, Tree]

Introduction

Which kinds of information can be used for labeling?

• Features from individual sites: intensity, color, texture, …
• Interactions with neighboring sites: contextual information

[Figure: regions labeled "Vegetation" and "Sky or Building?"]

Introduction

Contextual information: 2 types of interactions

• Interaction with neighboring labels (spatial smoothness of labels)
  – Neighboring sites tend to have similar labels (except at discontinuities)
• Interaction with neighboring observed data

[Figure: regions labeled Building, Sky, Sky]

Introduction

Let $l_i$ be the label of the $i$-th node of the image site set $S$, and let $N_i$ be the set of neighboring nodes of node $i$.

Three kinds of information for image labeling:

• Features from the local node: $\mathrm{Info}(l_i)$
• Interaction with neighboring labels: $\mathrm{Info}(l_i, l_{N_i})$
• Interaction with neighboring observed data: $\mathrm{Info}(l_i, x_{N_i})$

[Figure: node $i$, the remaining nodes $S - \{i\}$, and the neighborhood $N_i$]

Introduction

General formulation:

$$P(L \mid X) = \frac{1}{Z} \exp\Big\{ \sum_{i \in S} A_i(l_i, X) + \sum_{i \in S} \sum_{i' \in N_i} I_{ii'}(l_i, l_{i'}, X) \Big\}$$

where $A_i$ and $I_{ii'}$ are called the association potential and the interaction potential.

Introduction

Labels in Spatial Data are NOT independent!

– spatially adjacent labels are often the same (Markov Random Fields and Conditional Random Fields)

– spatially adjacent elements that have similar features often receive the same label (Conditional Random Fields)

– spatially adjacent elements that have different features may not have correlated labels (Conditional Random Fields)

Salient Object? Salience Map?

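Formulation of Static Salient Object Detection

CRF model (static image):

$$P(A \mid I) = \frac{1}{Z} \exp\big(-E(A \mid I)\big)$$

$$E(A \mid I) = \sum_x \sum_{k=1}^{K} \lambda_k F_k(a_x, I) + \sum_{x, x'} S(a_x, a_{x'}, I)$$

Maximizing $P(A \mid I)$ is equivalent to minimizing $E(A \mid I)$. The first term is the salient object feature and the second is the pairwise feature. The salient object features are:

• Strong contrast
• Center-surround histogram
• Color spatial-distribution

$$F_k(a_x, I) = \begin{cases} f_k(x, I) & a_x = 0 \\ 1 - f_k(x, I) & a_x = 1 \end{cases}$$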
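A minimal sketch of how the energy above could be evaluated for one labeling. The slides do not spell out the pairwise term $S$, so the code assumes a simple stand-in: a constant penalty `beta` whenever two 4-connected neighbors take different labels. The names `unary_features`, `energy`, `lam`, and `beta` are hypothetical.

```python
import numpy as np

def unary_features(f, a):
    """F_k(a_x, I): f_k(x, I) where a_x = 0, and 1 - f_k(x, I) where a_x = 1.

    f: (K, H, W) feature maps in [0, 1]; a: (H, W) binary labels.
    """
    return np.where(a[None] == 1, 1.0 - f, f)

def energy(f, a, lam, beta=1.0):
    """E(A|I) = sum_x sum_k lam_k F_k(a_x, I) + sum_{x,x'} S(a_x, a_x', I).

    ASSUMPTION: S is a constant penalty `beta` for each pair of
    4-connected neighbors with different labels (not from the slides).
    """
    F = unary_features(f, a)                      # (K, H, W)
    unary = np.tensordot(lam, F, axes=1).sum()    # weighted sum over k and x
    pair = beta * ((a[:, 1:] != a[:, :-1]).sum() + (a[1:, :] != a[:-1, :]).sum())
    return float(unary + pair)
```

With a high feature response at a pixel, labeling that pixel salient ($a_x = 1$) lowers the energy, which matches the "minimize $E$" reading of the model.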

Strong Contrast

Generate a contrast map for each level of a 6-level Gaussian pyramid, then combine the maps linearly.

$$f_c(x, I) = \sum_{l=1}^{L} \sum_{x' \in N(x)} \big\| I^l(x) - I^l(x') \big\|^2$$

[Figure: Input image, Level 1, Level 4]
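The multi-scale contrast can be sketched as follows for a grayscale image. This is an illustration under stated assumptions, not the authors' implementation: 2x2 block averaging stands in for a true Gaussian pyramid step, $N(x)$ is taken to be the 4 nearest neighbors, and each level is upsampled by nearest-neighbor repetition before summing. All function names are hypothetical.

```python
import numpy as np

def local_contrast(img):
    """Sum of squared differences to the 4 nearest neighbors at each pixel."""
    p = np.pad(img, 1, mode="edge")               # replicate borders
    c = np.zeros_like(img, dtype=float)
    for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
        shifted = p[1 + dy : 1 + dy + img.shape[0], 1 + dx : 1 + dx + img.shape[1]]
        c += (img - shifted) ** 2
    return c

def downsample(img):
    """ASSUMPTION: 2x2 block averaging stands in for a Gaussian pyramid step."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def multiscale_contrast(img, levels=6):
    img = np.asarray(img, dtype=float)
    H, W = img.shape
    total = np.zeros((H, W))
    level = img
    for l in range(levels):
        if min(level.shape) < 2:                  # pyramid exhausted
            break
        c = local_contrast(level)
        scale = 2 ** l                            # nearest-neighbor upsampling factor
        up = np.repeat(np.repeat(c, scale, axis=0), scale, axis=1)
        up = np.pad(up, ((0, max(0, H - up.shape[0])), (0, max(0, W - up.shape[1]))), mode="edge")
        total += up[:H, :W]
        level = downsample(level)
    m = total.max()
    return total / m if m > 0 else total          # normalize to [0, 1]
```

A single bright pixel on a dark background gets the highest response, while a constant image yields an all-zero map.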

Center-Surround Histogram

A salient object usually differs strongly from its local surroundings.

[Figure: candidate rectangle pairs; "More different between the 2 rectangles!"]

Center-Surround Histogram

$$f_h(x, I) \propto \sum_{\{x' \mid x \in R^*(x')\}} w_{xx'} \, \chi^2\big(R^*(x'), R_S^*(x')\big)$$

where

$$\chi^2(R, R_S) = \frac{1}{2} \sum_i \frac{(R^i - R_S^i)^2}{R^i + R_S^i}, \qquad w_{xx'} = \exp\big(-0.5\, \sigma_{x'}^{-2} \|x - x'\|^2\big)$$
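The two building blocks of this feature, the $\chi^2$ histogram distance and the Gaussian falloff weight, can be sketched as below. The histogram helper assumes gray values in $[0, 1]$ and a fixed bin count, neither of which comes from the slides, and all function names are hypothetical.

```python
import numpy as np

def chi_square(r, r_s):
    """chi^2(R, R_S) = 1/2 * sum_i (R^i - R_S^i)^2 / (R^i + R_S^i)."""
    r, r_s = np.asarray(r, float), np.asarray(r_s, float)
    denom = r + r_s
    mask = denom > 0                              # empty bins contribute nothing
    return 0.5 * np.sum((r[mask] - r_s[mask]) ** 2 / denom[mask])

def falloff_weight(x, x_center, sigma2):
    """w_xx' = exp(-0.5 * ||x - x'||^2 / sigma_x'^2)."""
    d2 = np.sum((np.asarray(x, float) - np.asarray(x_center, float)) ** 2)
    return float(np.exp(-0.5 * d2 / sigma2))

def normalized_hist(values, bins=8):
    """ASSUMPTION: gray values in [0, 1] and a fixed number of bins."""
    h, _ = np.histogram(values, bins=bins, range=(0.0, 1.0))
    return h / max(h.sum(), 1)
```

For normalized histograms the distance ranges from 0 (identical) to 1 (no shared bins), so a center rectangle whose histogram shares nothing with its surround scores highest.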

Color Spatial-Distribution

The more widely a color is distributed in the image, the less likely it is that a salient object contains that color.

Each pixel is assigned to a color component $c$ with the probability $p(c \mid I_x)$.

Color Spatial-Distribution

Then the feature can be defined as a weighted sum:

$$f_s(x, I) \propto \sum_c p(c \mid I_x) \, \big(1 - V(c)\big)$$
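The weighted sum can be sketched as follows, given soft assignments $p(c \mid I_x)$ that are already computed. The slides do not define $V(c)$ in this transcript, so the code assumes it is the responsibility-weighted spatial variance of component $c$ (horizontal plus vertical), min-max normalized to $[0, 1]$. The function name and that definition are assumptions.

```python
import numpy as np

def color_spatial_feature(resp):
    """resp: (C, H, W) soft assignments p(c | I_x), summing to 1 over c.

    ASSUMPTION: V(c) is the responsibility-weighted spatial variance of
    component c (horizontal + vertical), min-max normalized to [0, 1].
    """
    C, H, W = resp.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    mass = resp.sum(axis=(1, 2)) + 1e-12
    mean_x = (resp * xs).sum(axis=(1, 2)) / mass
    mean_y = (resp * ys).sum(axis=(1, 2)) / mass
    var = ((resp * (xs - mean_x[:, None, None]) ** 2).sum(axis=(1, 2))
           + (resp * (ys - mean_y[:, None, None]) ** 2).sum(axis=(1, 2))) / mass
    span = var.max() - var.min()
    V = (var - var.min()) / span if span > 0 else np.zeros(C)
    f = np.tensordot(1.0 - V, resp, axes=1)       # sum_c p(c | I_x) (1 - V(c))
    return f, V
```

A color component concentrated in one spot gets $V(c)$ near 0 and contributes fully, while a component spread over the whole image gets $V(c)$ near 1 and contributes almost nothing.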

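Formulation of Static Salient Object Detection

CRF model (static image):

$$P(A \mid I) = \frac{1}{Z} \exp\big(-E(A \mid I)\big)$$

$$E(A \mid I) = \sum_x \sum_{k=1}^{K} \lambda_k F_k(a_x, I) + \sum_{x, x'} S(a_x, a_{x'}, I)$$

Maximizing $P(A \mid I)$ is equivalent to minimizing $E(A \mid I)$. The first term is the salient object feature (strong contrast, center-surround histogram, color spatial-distribution) and the second is the pairwise feature.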

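Learning & Inference for the Model

The goal of CRF learning is to estimate the linear weights $\lambda = \{\lambda_k\}$ in

$$E(A \mid I) = \sum_x \sum_{k=1}^{K} \lambda_k F_k(a_x, I) + \sum_{x, x'} S(a_x, a_{x'}, I)$$

from the training images $\{I^n, A^n\}_{n=1}^{N}$ by maximizing the log-likelihood:

$$\lambda^* = \arg\max_{\lambda} \sum_n \log P(A^n \mid I^n; \lambda)$$

The weights $\{\lambda_k\}$ are found by gradient descent.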


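For a single training image, for example:

$$\frac{d \log P(A^n \mid I^n; \lambda)}{d \lambda_k} = -\sum_x F_k(a_x^n, I^n) - \frac{d \log Z(I^n, \lambda)}{d \lambda_k}$$

$$= -\sum_x F_k(a_x^n, I^n) - \frac{1}{Z(I^n, \lambda)} \frac{d}{d \lambda_k} \sum_{\tilde{A}^n} \exp\big(-E(\tilde{A}^n \mid I^n; \lambda)\big)$$

$$= -\sum_x F_k(a_x^n, I^n) + \frac{1}{Z(I^n, \lambda)} \sum_{\tilde{A}^n} \Big[\exp\big(-E(\tilde{A}^n \mid I^n; \lambda)\big)\Big] \sum_x F_k(\tilde{a}_x^n, I^n)$$

$$= -\sum_x F_k(a_x^n, I^n) + \sum_{\tilde{A}^n} \sum_x F_k(\tilde{a}_x^n, I^n)\, p(\tilde{A}^n \mid I^n; \lambda)$$

where

$$Z(I^n, \lambda) = \sum_{\tilde{A}^n} \exp\big(-E(\tilde{A}^n \mid I^n; \lambda)\big)$$

• $I^n$: a training image
• $A^n$: the label inferred at this iteration
• $\tilde{A}^n$: a possible label
• $a$: the label per pixel

Gradient descent:

$$\lambda_k^{t+1} = \lambda_k^t + \sum_{\tilde{A}^n} \sum_x F_k(\tilde{a}_x^n, I^n)\, p(\tilde{A}^n \mid I^n) - \sum_x F_k(a_x^n, I^n)$$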
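The gradient of the log-likelihood can be checked numerically on a model small enough to enumerate every labeling. The sketch below assumes a 1-D chain of pixels and a constant pairwise penalty `beta` for unequal neighbors. Both are illustrative choices rather than details from the slides, and the function names are hypothetical. Exhaustive enumeration is only feasible for a handful of pixels.

```python
import itertools
import numpy as np

def F(f, a):
    """F_k(a_x, I) summed over pixels: f_k where a_x = 0, 1 - f_k where a_x = 1."""
    a = np.asarray(a)
    return np.where(a[None] == 1, 1.0 - f, f).sum(axis=1)       # shape (K,)

def pairwise(a, beta):
    """ASSUMPTION: constant penalty for unequal neighbors on a 1-D chain."""
    a = np.asarray(a)
    return beta * np.sum(a[1:] != a[:-1])

def log_likelihood_and_grad(f, a_n, lam, beta=0.5):
    """f: (K, X) feature values; a_n: label of the training image; lam: (K,) weights."""
    labelings = list(itertools.product((0, 1), repeat=f.shape[1]))
    feats = np.array([F(f, a) for a in labelings])               # (2^X, K)
    energies = feats @ lam + np.array([pairwise(a, beta) for a in labelings])
    log_z = np.log(np.sum(np.exp(-energies)))
    probs = np.exp(-energies - log_z)                            # p(A | I; lam)
    e_n = F(f, a_n) @ lam + pairwise(a_n, beta)
    grad = -F(f, a_n) + probs @ feats                            # final line of the derivation
    return -e_n - log_z, grad
```

One update then reads `lam = lam + step * grad`, where `step` is a step size chosen by the user (an assumption; the slides show no step size).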

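How to use ground-truth information?

$$\lambda_k^{t+1} = \lambda_k^t + \sum_{\tilde{A}^n} \sum_x F_k(\tilde{a}_x^n, I^n)\, p(\tilde{A}^n \mid I^n) - \sum_x \sum_{a_x^n} F_k(a_x^n, I^n)\, p(a_x^n \mid g_x^n)$$

$$p(a_x^n \mid g_x^n) = \begin{cases} 1 - g_x^n & a_x = 0 \\ g_x^n & a_x = 1 \end{cases}$$

where $\tilde{A}^n$ is a possible label with per-pixel labels $\tilde{a}_x^n$, $g_x^n$ is the labeled ground truth, and $t$ denotes the $t$-th iteration.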


Situations:

• $a_x^n = 0,\ g_x^n = 0$: $p(a_x^n \mid g_x^n) = 1$
• $a_x^n = 0,\ g_x^n = 1$: $p(a_x^n \mid g_x^n) = 0$
• $a_x^n = 1,\ g_x^n = 0$: $p(a_x^n \mid g_x^n) = 0$
• $a_x^n = 1,\ g_x^n = 1$: $p(a_x^n \mid g_x^n) = 1$

Ground-truth mistake!

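Ground-Truth Mistake

Solution: apply a Gaussian function $G(x)$ to weight every pixel in the rectangle.

$$p(a_x^n \mid g_x^n) = \begin{cases} 1 - g_x^n\, G(x) & a_x = 0 \\ g_x^n\, G(x) & a_x = 1 \end{cases}$$

This replaces the unweighted form:

$$p(a_x^n \mid g_x^n) = \begin{cases} 1 - g_x^n & a_x = 0 \\ g_x^n & a_x = 1 \end{cases}$$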
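The two forms of $p(a_x \mid g_x)$ can be compared with a short sketch. The slides do not give the parameters of $G(x)$, so the code assumes an axis-aligned Gaussian with peak value 1, centered on the labeled rectangle, with a standard deviation of one third of the rectangle size. The function names and `sigma_scale` are hypothetical.

```python
import numpy as np

def p_label_given_gt(g, G=None):
    """Return p(a_x = 0 | g_x) and p(a_x = 1 | g_x) for a ground-truth mask g.

    With G=None this is the unweighted form (1 - g, g); otherwise the
    Gaussian-weighted form (1 - g * G, g * G).
    """
    g = np.asarray(g, dtype=float)
    w = g if G is None else g * G
    return 1.0 - w, w

def gaussian_weight(shape, box, sigma_scale=1.0 / 3.0):
    """ASSUMPTION: G(x) is an axis-aligned Gaussian centered on the labeled
    rectangle box = (top, left, bottom, right), with peak value 1 and a
    standard deviation of `sigma_scale` times the rectangle size.
    """
    top, left, bottom, right = box
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]].astype(float)
    cy, cx = (top + bottom - 1) / 2.0, (left + right - 1) / 2.0
    sy, sx = sigma_scale * (bottom - top), sigma_scale * (right - left)
    return np.exp(-0.5 * (((ys - cy) / sy) ** 2 + ((xs - cx) / sx) ** 2))
```

Pixels near the rectangle center keep a probability close to 1 of being salient, while pixels near its corners, which are more likely to be background, are down-weighted.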

Inference

We should find the most probable labeling, the one that maximizes $p(A \mid I)$, in both training and detection.

• BP: Max-product Belief Propagation [Pearl ‘86]
  + Can be applied to any energy function
  – In vision, results are usually worse than those of graph cuts
  – Does not always converge
• TRW: Max-product Tree-Reweighted Message Passing [Wainwright, Jaakkola, Willsky ‘02], [Kolmogorov ‘05]
  + Can be applied to any energy function
  + Convergence guarantees for the algorithm in [Kolmogorov ’05]
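To illustrate what "finding the most probable labeling" means, the sketch below runs min-sum message passing (max-product in the negative-log domain) on a 1-D chain, where it is exact, and checks it against exhaustive search. This is a simplified stand-in: image grids contain loops and need loopy BP or TRW, which this sketch does not implement. The pairwise energy `beta` and all function names are assumptions.

```python
import itertools
import numpy as np

def chain_map(unary, beta):
    """Min-sum (max-product in the log domain) on a 1-D chain.

    unary: (X, 2) energies for labels 0/1 at each pixel.
    ASSUMPTION: pairwise energy is `beta` when neighbors differ, else 0.
    """
    X = unary.shape[0]
    pair = beta * (1 - np.eye(2))
    msg = np.zeros((X, 2))                  # best prefix energy ending in each label
    back = np.zeros((X, 2), dtype=int)      # backpointers for decoding
    msg[0] = unary[0]
    for i in range(1, X):
        cand = msg[i - 1][:, None] + pair   # rows: previous label, cols: current label
        back[i] = cand.argmin(axis=0)
        msg[i] = unary[i] + cand.min(axis=0)
    labels = [int(msg[-1].argmin())]
    for i in range(X - 1, 0, -1):
        labels.append(int(back[i][labels[-1]]))
    return labels[::-1], float(msg[-1].min())

def brute_force_map(unary, beta):
    """Exhaustive search, only feasible for tiny chains; used as a check."""
    def energy(a):
        a = np.asarray(a)
        return unary[np.arange(len(a)), a].sum() + beta * np.sum(a[1:] != a[:-1])
    best = min(itertools.product((0, 1), repeat=unary.shape[0]), key=energy)
    return list(best), float(energy(best))
```

Minimizing the energy is the same as maximizing $p(A \mid I)$, since the normalizer $Z$ does not depend on the labeling.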

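Formulation of Dynamic Salient Object Detection

Similar to static salient object detection!

$$P(A_t \mid I_t, I_{t-1}) = \frac{1}{Z} \exp\big(-E(A_t \mid I_t, I_{t-1})\big)$$

$$E(A_t \mid I_t, I_{t-1}) = \sum_x \Big( \sum_{k=1}^{K} \lambda_k F_k(a_x, I_t) + \sum_{k=K+1}^{K+L} \lambda_k F_k(a_x, M_t) + F(a_x, I_t, I_{t-1}) \Big) + \sum_{x, x'} S(a_x, a_{x'}, I_t)$$

• $F_k(a_x, I_t)$: static salient features
• $F_k(a_x, M_t)$: motion features (contrast of motion, spatial-distribution of motion)
• $F(a_x, I_t, I_{t-1})$: penalty term of motion

As in the static case, the probability is maximized.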

Results

From left to right: input image, multi-scale contrast, center-surround histogram, color spatial distribution, and binary mask by CRF.

Results

From left to right: fuzzy growing based method, salience map, their approach, ground truth.

Results

[Figure: evaluation on image set A and image set B.
Features compared: 1. multi-scale contrast 2. center-surround histogram 3. color spatial distribution 4. combination of all.
Methods compared: 1. FG (fuzzy growing) 2. SM (salience map) 3. their approach]

Conclusion

They model salient object detection with a CRF, in which a group of salient features is combined through CRF learning.

They propose a set of novel local, regional, and global salient features to define a generic salient object.

Multi-object & no object cases are left as future work.

References (paper & book)

• “Learning to Detect A Salient Object”, CVPR 2007, PAMI 2010. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4387&rep=rep1&type=pdf
• “A Model of Saliency-Based Visual Attention for Rapid Scene Analysis”, PAMI, 1998.
• “Pattern Recognition and Machine Learning”, C. M. Bishop. http://www.library.wisc.edu/selectedtocs/bg0137.pdf

References (about CRF)

• “Discriminative Random Fields”, IJCV, 2006. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.116.4695&rep=rep1&type=pdf
• “Conditional Random Fields: An Introduction”, H. M. Wallach. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.64.436&rep=rep1&type=pdf
• “Log-linear models & conditional random fields”, C. Elkan, 2008. http://cseweb.ucsd.edu/~elkan/250B/cikmtutorial.pdf

References (about TRW)

• “MAP estimation via agreement on (hyper) trees: Message-passing and linear programming approaches”, IEEE Transactions on Information Theory, 2005. http://arxiv.org/pdf/cs/0508070
• “Convergent Tree-Reweighted Message Passing for Energy Minimization”, PAMI, 2006. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.100.2409&rep=rep1&type=pdf
• “A Comparative Study of Energy Minimization Methods for Markov Random Fields with Smoothness-Based Priors”, PAMI, 2008.