Learning to Detect A Salient Object
Reporter: 鄭綱 (3/2)
Outline
• Introduction
• CRF Formulation of Static Salient Object
  – Strong Contrast
  – Center-Surround Histogram
  – Color Spatial-Distribution
  – Learning & Inference for the Model
• Formulation of Dynamic Salient Object
• Results
• Conclusion
• References
Introduction
Image Labeling Problem
[Figure: example image with regions labeled Sky, Building, Lawn, Plane, Tree]
Introduction
Which kinds of information can be used for labeling?
• Features from individual sites
  – Intensity, color, texture, …
• Interactions with neighboring sites
  – Contextual information
[Figure: a region next to Vegetation; should it be labeled Sky or Building?]
Introduction
Contextual information: 2 types of interactions
• Interaction with neighboring labels (spatial smoothness of labels)
  – Neighboring sites tend to have similar labels (except at the discontinuities)
• Interaction with neighboring observed data
[Figure: neighboring regions labeled Building, Sky, Sky]
Introduction
Let $l_i$ be the label of the $i$-th node of the image site set $S$, and $N_i$ be the neighboring nodes of node $i$.
Three kinds of information for image labeling:
• Features from the local node: $\mathrm{Info}(l_i)$
• Interaction with neighboring labels: $\mathrm{Info}(l_i, l_{N_i})$
• Interaction with neighboring observed data: $\mathrm{Info}(l_i, x_{N_i})$
[Figure: node $i$, its neighborhood $N_i$, and the remaining sites $S - \{i\}$]
Introduction
General formulation:
$$P(L \mid X) = \frac{1}{Z} \exp\Big\{ \sum_{i \in S} A_i(l_i, X) + \sum_{i \in S} \sum_{i' \in N_i} I_{ii'}(l_i, l_{i'}, X) \Big\}$$
where $A_i$ and $I_{ii'}$ are called the association potential and the interaction potential.
Introduction
Labels in Spatial Data are NOT independent!
– spatially adjacent labels are often the same (Markov Random Fields and Conditional Random Fields)
– spatially adjacent elements that have similar features often receive the same label (Conditional Random Fields)
– spatially adjacent elements that have different features may not have correlated labels (Conditional Random Fields)
Salient Object? Salience Map?
Formulation of Static Salient Object Detection
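CRF model (static image):
$$E(A \mid I) = \sum_x \sum_{k=1}^{K} \lambda_k F_k(a_x, I) + \sum_{x,x'} S(a_x, a_{x'}, I) \quad \text{(minimize!)}$$
$$P(A \mid I) = \frac{1}{Z} \exp(-E(A \mid I)) \quad \text{(maximize!)}$$
• $F_k(a_x, I)$: salient object feature
  – Strong contrast
  – Center-surround histogram
  – Color spatial-distribution
• $S(a_x, a_{x'}, I)$: pairwise feature
$$F_k(a_x, I) = \begin{cases} f_k(x, I) & a_x = 0 \\ 1 - f_k(x, I) & a_x = 1 \end{cases}$$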
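To make the energy concrete, here is a minimal Python sketch that evaluates $E(A \mid I)$ on a 1-D row of pixels, which stands in for the 2-D pixel grid. The unary part follows the $F_k$ definition above. The slides do not spell out the pairwise term $S$, so the form used here (a label-change penalty that shrinks as neighboring intensities differ, with a hypothetical parameter `beta`) is an assumption, as are all function and parameter names.

```python
from math import exp

def unary_cost(f, a):
    """F_k(a_x, I): f_k(x, I) if a_x = 0, else 1 - f_k(x, I)."""
    return f if a == 0 else 1.0 - f

def energy(labels, features, weights, colors, beta=1.0):
    """E(A|I) for a 1-D row of pixels.

    labels   : list of 0/1 labels a_x (1 = salient)
    features : features[k][x] = f_k(x, I), each in [0, 1]
    weights  : weights[k] = lambda_k
    colors   : one intensity per pixel, used only by the assumed pairwise term
    """
    unary = sum(
        weights[k] * unary_cost(features[k][x], labels[x])
        for k in range(len(weights))
        for x in range(len(labels))
    )
    # Assumed pairwise term: a label change between neighbors costs more
    # when the two pixels look alike.
    pairwise = sum(
        abs(labels[x] - labels[x + 1]) * exp(-beta * abs(colors[x] - colors[x + 1]))
        for x in range(len(labels) - 1)
    )
    return unary + pairwise
```

With a single feature map that is high on the right half of the row, the labeling `[0, 0, 1, 1]` should get a lower energy than `[1, 1, 0, 0]`, which is what minimizing $E$ (maximizing $P$) rewards.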
Strong Contrast
Generate a contrast map for each level of a 6-level Gaussian pyramid, then combine the maps linearly.
$$f_c(x, I) = \sum_{l=1}^{L} \sum_{x' \in N(x)} \| I^l(x) - I^l(x') \|^2$$
[Figure: input image and its contrast maps at Level 1 and Level 4]
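The multi-scale contrast feature can be sketched in a few lines of plain Python on a grayscale image stored as a list of rows. This is only an illustration under stated assumptions: the pyramid is built by 2x2 block averaging rather than true Gaussian filtering, the neighborhood $N(x)$ is a small square window of hypothetical size `radius`, coarse maps are brought back to full size by nearest-neighbor lookup, and the result is scaled to [0, 1].

```python
def downsample(img):
    """Halve the resolution by 2x2 block averaging (a crude stand-in for
    Gaussian smoothing followed by subsampling)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]

def contrast_map(img, radius=1):
    """Sum of squared differences between each pixel and its window N(x)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    out[y][x] += (img[y][x] - img[ny][nx]) ** 2
    return out

def multiscale_contrast(img, levels=6):
    """f_c(x, I): add the per-level contrast maps at full resolution,
    then scale the result to [0, 1]."""
    h, w = len(img), len(img[0])
    total = [[0.0] * w for _ in range(h)]
    level_img = img
    for level in range(levels):
        c = contrast_map(level_img)
        scale = 2 ** level
        for y in range(h):
            for x in range(w):
                # Nearest-neighbor lookup into the coarser map.
                cy = min(y // scale, len(c) - 1)
                cx = min(x // scale, len(c[0]) - 1)
                total[y][x] += c[cy][cx]
        if min(len(level_img), len(level_img[0])) < 2:
            break  # image too small to downsample further
        level_img = downsample(level_img)
    peak = max(max(row) for row in total)
    return [[v / peak for v in row] for row in total] if peak > 0 else total
```

On an image with a small bright square on a dark background, the pixels of the square receive the highest response, while pixels far from it stay close to zero.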
Center-Surround Histogram
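A salient object usually differs strongly from its local surroundings.
[Figure: candidate center rectangles and their surrounding rectangles; the larger the difference between the 2 rectangles, the better]
Center-Surround Histogram
$$f_h(x, I) \propto \sum_{\{x' \mid x \in R^*(x')\}} w_{xx'}\, \chi^2\big(R^*(x'), R_s^*(x')\big)$$
where
$$\chi^2(R, R_s) = \frac{1}{2} \sum_i \frac{(R^i - R_s^i)^2}{R^i + R_s^i}$$
$$w_{xx'} = \exp\big(-0.5\, \sigma_{x'}^{-2}\, \|x - x'\|^2\big)$$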
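The core of this feature is the $\chi^2$ distance between the histogram of a center rectangle and the histogram of its surround. The sketch below shows only that distance for one fixed rectangle on a grayscale image with values in [0, 1]. The search over rectangle sizes and the Gaussian weighting $w_{xx'}$ are left out, and the surround is taken to be a band of hypothetical width `margin` around the rectangle, which is an assumption made for brevity.

```python
def chi_square(hist_a, hist_b):
    """0.5 * sum_i (a_i - b_i)^2 / (a_i + b_i), skipping bins empty in both."""
    return 0.5 * sum((a - b) ** 2 / (a + b)
                     for a, b in zip(hist_a, hist_b) if a + b > 0)

def histogram(values, bins=4):
    """Normalized histogram of values in [0, 1]; all zeros if no values."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    n = float(len(values))
    return [c / n for c in counts] if n > 0 else [0.0] * bins

def center_surround_distance(img, top, left, height, width, margin):
    """Chi-square distance between a rectangle and the band around it."""
    h, w = len(img), len(img[0])
    center, surround = [], []
    for y in range(max(0, top - margin), min(h, top + height + margin)):
        for x in range(max(0, left - margin), min(w, left + width + margin)):
            inside = top <= y < top + height and left <= x < left + width
            (center if inside else surround).append(img[y][x])
    return chi_square(histogram(center), histogram(surround))
```

A rectangle that tightly encloses a bright object on a dark background gives two disjoint histograms and therefore the largest distance; a rectangle placed mostly on background gives a small one.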
Color Spatial-Distribution
The more widely a color is distributed in the image, the less likely it is that a salient object contains this color.
Each pixel $x$ is assigned to a color component $c$ with probability $p(c \mid I_x)$.
Color Spatial-Distribution
Then the feature can be defined as a weighted sum:
$$f_s(x, I) \propto \sum_c p(c \mid I_x) \cdot (1 - V(c))$$
where $V(c)$ is the spatial variance of color component $c$.
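As a rough illustration of the weighted sum, the sketch below takes the per-pixel component probabilities $p(c \mid I_x)$ as given (how they are estimated is outside this sketch), computes a spatial variance $V(c)$ for each component, rescales the variances to [0, 1] across components, and evaluates the feature per pixel. The rescaling step and all names are assumptions for the example.

```python
def color_spatial_feature(positions, probs):
    """Per-pixel f_s = sum_c p(c | I_x) * (1 - V(c)).

    positions : positions[i] = (x, y) coordinates of pixel i
    probs     : probs[i][c] = p(c | I_x) for pixel i and color component c
    """
    n_comp = len(probs[0])
    variances = []
    for c in range(n_comp):
        mass = sum(p[c] for p in probs)
        if mass == 0:
            variances.append(0.0)
            continue
        mean_x = sum(p[c] * pos[0] for pos, p in zip(positions, probs)) / mass
        mean_y = sum(p[c] * pos[1] for pos, p in zip(positions, probs)) / mass
        # Probability-weighted spread of component c around its mean position.
        var = sum(p[c] * ((pos[0] - mean_x) ** 2 + (pos[1] - mean_y) ** 2)
                  for pos, p in zip(positions, probs)) / mass
        variances.append(var)
    lo, hi = min(variances), max(variances)
    span = hi - lo
    # Assumed normalization: rescale V(c) to [0, 1] across components.
    v = [(var - lo) / span if span > 0 else 0.0 for var in variances]
    return [sum(p[c] * (1.0 - v[c]) for c in range(n_comp)) for p in probs]
```

In a 5x5 image where one color fills the compact 3x3 center and another fills the surrounding ring, the center color has the smaller spatial variance, so center pixels score high and ring pixels score low.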
Formulation of Static Salient Object Detection
CRF model (static image):
$$E(A \mid I) = \sum_x \sum_{k=1}^{K} \lambda_k F_k(a_x, I) + \sum_{x,x'} S(a_x, a_{x'}, I) \quad \text{(minimize!)}$$
$$P(A \mid I) = \frac{1}{Z} \exp(-E(A \mid I)) \quad \text{(maximize!)}$$
• $F_k(a_x, I)$: salient object feature
  – Strong contrast
  – Center-surround histogram
  – Color spatial-distribution
• $S(a_x, a_{x'}, I)$: pairwise feature
Learning & Inference for the Model
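The goal of CRF learning is to estimate the linear weights $\{\lambda_k\}$:
$$\lambda^* = \arg\max_{\lambda} \sum_n \log P(A^n \mid I^n; \lambda)$$
$$E(A \mid I) = \sum_x \sum_{k=1}^{K} \lambda_k F_k(a_x, I) + \sum_{x,x'} S(a_x, a_{x'}, I)$$
Training images: $\{I^n, A^n\}_{n=1}^{N}$
The weights $\{\lambda_k\}$ are found by gradient descent.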
Learning & Inference for the Model
For one training image, for example:
$$\frac{d \log P(A^n \mid I^n; \lambda)}{d\lambda_k} = -\sum_x F_k(a_x^n, I^n) - \frac{d \log Z(I^n, \lambda)}{d\lambda_k}$$
$$= -\sum_x F_k(a_x^n, I^n) - \frac{1}{Z(I^n, \lambda)} \sum_A \frac{d \exp(-E(A \mid I^n; \lambda))}{d\lambda_k}$$
$$= -\sum_x F_k(a_x^n, I^n) + \frac{1}{Z(I^n, \lambda)} \sum_A \big[\exp(-E(A \mid I^n; \lambda))\big] \sum_x F_k(a_x, I^n)$$
$$= -\sum_x F_k(a_x^n, I^n) + \sum_A \sum_x F_k(a_x, I^n)\, p(A \mid I^n; \lambda)$$
where
$$Z(I^n, \lambda) = \sum_A \exp(-E(A \mid I^n; \lambda))$$
• $I^n$: a training image
• $A^n$: the labeling of image $n$ (inferred in this iteration)
• $A$: a possible labeling
• $a_x$: the label of pixel $x$
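The final line of the derivation, data term with a minus sign plus the model expectation, can be checked numerically on a toy problem. The sketch below enumerates all labelings of a short 1-D row of pixels to get $Z$ exactly, which is only feasible for a handful of pixels. The pairwise term is an assumed constant penalty for each label change, and the function names are hypothetical.

```python
from itertools import product
from math import exp, log

def unary_cost(f, a):
    """F_k(a_x, I): f_k(x, I) if a_x = 0, else 1 - f_k(x, I)."""
    return f if a == 0 else 1.0 - f

def energy(labels, features, weights):
    """E(A|I) on a 1-D row: weighted unary costs plus an assumed
    unit penalty for every label change between neighbors."""
    n = len(labels)
    unary = sum(weights[k] * unary_cost(features[k][x], labels[x])
                for k in range(len(weights)) for x in range(n))
    pairwise = sum(abs(labels[x] - labels[x + 1]) for x in range(n - 1))
    return unary + pairwise

def log_likelihood(labels, features, weights):
    """log P(A|I) = -E(A|I) - log Z, with Z computed by enumeration."""
    n = len(labels)
    z = sum(exp(-energy(a, features, weights))
            for a in product((0, 1), repeat=n))
    return -energy(labels, features, weights) - log(z)

def gradient(labels, features, weights):
    """d log P / d lambda_k = -sum_x F_k(a_x^n) + E_model[sum_x F_k(a_x)]."""
    n = len(labels)
    labelings = list(product((0, 1), repeat=n))
    scores = [exp(-energy(a, features, weights)) for a in labelings]
    z = sum(scores)
    grad = []
    for k in range(len(weights)):
        observed = sum(unary_cost(features[k][x], labels[x]) for x in range(n))
        expected = sum(
            (s / z) * sum(unary_cost(features[k][x], a[x]) for x in range(n))
            for a, s in zip(labelings, scores)
        )
        grad.append(-observed + expected)
    return grad
```

Comparing `gradient` with a central finite difference of `log_likelihood` is a quick way to confirm the signs in the derivation. On a real image the sum over all labelings is intractable, which is why an approximate inference method is needed.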
Learning & Inference for the Model
Gradient descent:
$$\lambda_k^{t+1} = \lambda_k^t + \Big( \sum_{A,x} F_k(a_x, I^n)\, p(A \mid I^n) - \sum_x F_k(a_x^n, I^n) \Big)$$
How to use ground-truth information?
$$\lambda_k^{t+1} = \lambda_k^t + \Big( \sum_{A,x} F_k(a_x, I^n)\, p(A \mid I^n) - \sum_x F_k(a_x^n, I^n)\, p(a_x^n \mid g_x^n) \Big)$$
$$p(a_x^n \mid g_x^n) = \begin{cases} 1 - g_x^n & a_x = 0 \\ g_x^n & a_x = 1 \end{cases}$$
where $g_x^n$ is the labeled ground truth and $t$ is the iteration index.
Learning & Inference for the Model
Situations:
• $a_x^n = 0,\ g_x^n = 0$: $p(a_x^n \mid g_x^n) = 1$
• $a_x^n = 1,\ g_x^n = 1$: $p(a_x^n \mid g_x^n) = 1$
• $a_x^n = 0,\ g_x^n = 1$: $p(a_x^n \mid g_x^n) = 0$
• $a_x^n = 1,\ g_x^n = 0$: $p(a_x^n \mid g_x^n) = 0$
Ground-truth mistake!!!
Ground-Truth Mistake
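Solution: apply a Gaussian function $G(x)$ to weight every pixel in the rectangle.
Before:
$$p(a_x^n \mid g_x^n) = \begin{cases} 1 - g_x^n & a_x = 0 \\ g_x^n & a_x = 1 \end{cases}$$
After:
$$p(a_x^n \mid g_x^n) = \begin{cases} 1 - g_x^n\, G(x) & a_x = 0 \\ g_x^n\, G(x) & a_x = 1 \end{cases}$$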
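A small sketch of the Gaussian-weighted ground truth follows. The slides do not give the exact form of $G(x)$, so the version here (centered on the labeled rectangle, with a spread proportional to the rectangle size through a hypothetical `sigma_scale`) is an assumption, as are the function names.

```python
from math import exp

def gaussian_weight(x, y, rect, sigma_scale=0.5):
    """G(x): 1 at the rectangle center, decaying toward its borders.

    rect is (top, left, height, width); sigma_scale is a made-up knob that
    sets the spread relative to the rectangle size."""
    top, left, height, width = rect
    cy = top + (height - 1) / 2.0
    cx = left + (width - 1) / 2.0
    sy, sx = sigma_scale * height, sigma_scale * width
    return exp(-0.5 * (((y - cy) / sy) ** 2 + ((x - cx) / sx) ** 2))

def label_probability(a, g, weight=1.0):
    """p(a_x | g_x): g * G(x) for a_x = 1 and 1 - g * G(x) for a_x = 0.

    With weight = 1.0 this reduces to the unweighted hard ground truth."""
    p_salient = g * weight
    return p_salient if a == 1 else 1.0 - p_salient
```

With the weight in place, a pixel near the rectangle border is treated as salient with lower probability than a pixel at the center, so background pixels that the rectangle happens to enclose pull less on the learned weights.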
Inference
We should find the most probable labeling, the one that maximizes $p(A \mid I)$, in both training and detection.
• BP: Max-product Belief Propagation [Pearl ‘86]
  + Can be applied to any energy function
  – In vision, results are usually worse than those of graph cuts
  – Does not always converge
• TRW: Max-product Tree-reweighted Message Passing [Wainwright, Jaakkola, Willsky ‘02], [Kolmogorov ‘05]
  + Can be applied to any energy function
  + Convergence guarantees for the algorithm in [Kolmogorov ’05]
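To show what max-product message passing computes, here is a minimal sketch on a chain of nodes, written in the equivalent min-sum form over energies. On a chain (no loops) the forward pass plus backtracking returns the exact minimum-energy labeling; the pixel grid of the saliency CRF has loops, so BP and TRW there are iterative and approximate. The function name and argument layout are made up for this example.

```python
def chain_map(unary, pairwise):
    """Minimum-energy labeling of a chain by min-sum message passing.

    unary[x][a]    : cost of giving label a to node x
    pairwise(a, b) : cost of labels a and b on two neighboring nodes
    """
    labels = range(len(unary[0]))
    cost = [list(unary[0])]   # cost[x][b]: best energy of nodes 0..x with label b at x
    back = []                 # back[x-1][b]: best label at node x-1 given b at node x
    for x in range(1, len(unary)):
        row, ptr = [], []
        for b in labels:
            best = min(labels, key=lambda a: cost[-1][a] + pairwise(a, b))
            row.append(cost[-1][best] + pairwise(best, b) + unary[x][b])
            ptr.append(best)
        cost.append(row)
        back.append(ptr)
    # Backtrack from the best label at the last node.
    result = [min(labels, key=lambda a: cost[-1][a])]
    for ptr in reversed(back):
        result.append(ptr[result[-1]])
    return result[::-1]
```

For example, with unary costs `[[0.1, 0.9], [0.6, 0.4], [0.8, 0.2], [0.7, 0.3]]` and a pairwise cost of `0.5 * abs(a - b)`, the call returns one label per node, switching from 0 to 1 only where the unary evidence outweighs the smoothness penalty.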
Formulation of Dynamic Salient Object Detection
Similar to static salient object detection!
$$P(A_t \mid I_t, I_{t-1}) = \frac{1}{Z} \exp(-E(A_t \mid I_t, I_{t-1})) \quad \text{(maximize!)}$$
$$E(A_t \mid I_t, I_{t-1}) = \sum_x \Big( \sum_{k=1}^{K} \lambda_k F_k(a_x, I_t) + \sum_{k=K+1}^{K+L} \lambda_k F_k(a_x, M_t) + F(a_x, I_t, I_{t-1}) \Big) + \sum_{x,x'} S(a_x, a_{x'}, I_t)$$
• $F_k(a_x, I_t)$: static salient feature
• $F_k(a_x, M_t)$: motion features
  – Contrast of motion
  – Center-surround histogram of motion
  – Spatial-distribution of motion
• $F(a_x, I_t, I_{t-1})$: penalty term of motion
Results
From left to right: input image, multi-scale contrast, center-surround histogram, color spatial distribution, and binary mask by CRF.
Results
From left to right: fuzzy growing based method, salience map, their approach, and ground truth.
Results
[Figure: evaluation on image set A and image set B. Feature comparison: 1. multi-scale contrast, 2. center-surround histogram, 3. color spatial distribution, 4. combination of all. Method comparison: 1. FG, 2. SM, 3. their approach]
Conclusion
They model salient object detection with a CRF, in which a group of salient features is combined through CRF learning.
They propose a set of novel local, regional, and global salient features to define a generic salient object.
Multi-object and no-object cases are left as future work.
References (paper & book)
“Learning to Detect A Salient Object”, CVPR 2007, PAMI 2010.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4387&rep=rep1&type=pdf
“A Model of Saliency-Based Visual Attention for Rapid Scene Analysis”, PAMI, 1998.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.65.2199&rep=rep1&type=pdf
“Pattern Recognition and Machine Learning”, C. M. Bishop.
http://www.library.wisc.edu/selectedtocs/bg0137.pdf
References (about CRF)
“Discriminative Random Fields”, IJCV, 2006.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.116.4695&rep=rep1&type=pdf
“Conditional Random Fields: An Introduction”, H. M. Wallach.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.64.436&rep=rep1&type=pdf
“Log-linear models & conditional random fields”, C. Elkan, 2008.
http://cseweb.ucsd.edu/~elkan/250B/cikmtutorial.pdf
References (about TRW)
“MAP estimation via agreement on (hyper) trees: Message-passing and linear programming approaches”, IEEE Transactions on Information Theory, 2005.
http://arxiv.org/pdf/cs/0508070
“Convergent Tree-Reweighted Message Passing for Energy Minimization”, PAMI, 2006.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.100.2409&rep=rep1&type=pdf
“A Comparative Study of Energy Minimization Methods for Markov Random Fields with Smoothness-Based Priors”, PAMI, 2008.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.142.4997&rep=rep1&type=pdf