Non-sampling functional approximation of linear and non-linear Bayesian Update
Alexander Litvinenko
H. G. Matthies
Institute of Scientific Computing, TU Braunschweig
Brunswick, Germany
http://www.wire.tu-bs.de
Lorenz-63 Problem
Lorenz-63 (Lorenz, 1963) is a system of ODEs that has chaotic solutions for certain parameter values and initial conditions. [The linear Bayesian update (LBU) is done by O. Pajonk.]
\[ \dot{x} = \sigma (y - x), \qquad \dot{y} = x(\rho - z) - y, \qquad \dot{z} = xy - \beta z \]
The initial state $q_0(\omega) = (x_0(\omega), y_0(\omega), z_0(\omega))$ is uncertain.
Solve at $t_0, t_1, \dots, t_{10}$; noisy measurement → UPDATE; solve at $t_{11}, t_{12}, \dots, t_{20}$; noisy measurement → UPDATE; ...
Idea of the Bayesian update (BU): take $q_f(\omega) = q_0(\omega)$.
Linear BU: $q_a = q_f + K (z - y)$.
Non-linear BU: $q_a = q_f + H_1 (z - y) + (z - y)^T H_2 (z - y)$.
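The forward solve between two updates can be sketched as follows. This is only an illustration: the slides do not state parameter values, step size, or integrator, so the classical values $\sigma = 10$, $\rho = 28$, $\beta = 8/3$, the step `dt = 0.01`, and the Runge-Kutta scheme are assumptions, and the function names are hypothetical.

```python
import numpy as np

def lorenz63(q, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz-63 system (parameter values are assumed)."""
    x, y, z = q
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(f, q, dt):
    """One classical fourth-order Runge-Kutta step (integrator is an assumption)."""
    k1 = f(q)
    k2 = f(q + 0.5 * dt * k1)
    k3 = f(q + 0.5 * dt * k2)
    k4 = f(q + dt * k3)
    return q + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def integrate(q0, dt=0.01, n_steps=10):
    """Integrate from q0 over n_steps steps; returns states at t_0, ..., t_n."""
    traj = np.empty((n_steps + 1, 3))
    traj[0] = q0
    for i in range(n_steps):
        traj[i + 1] = rk4_step(lorenz63, traj[i], dt)
    return traj

# One realisation of the uncertain initial state (hypothetical values).
q0 = np.array([1.0, 1.0, 1.0])
traj = integrate(q0)  # states at t_0, ..., t_10, after which an update would follow
```

In the setting of the slides the state is a random variable, so each realisation (or each PCE coefficient vector) would be propagated this way before the measurement is assimilated.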
TU Braunschweig Institute of Scientific Computing
Setting for the identification process
General idea: We observe / measure a system whose structure we know in principle.
The system behaviour depends on some quantities (parameters) which we do not know ⇒ uncertainty.
We model (the uncertainty in) our knowledge in a Bayesian setting: as a probability distribution on the parameters.
We start with what we know a priori, then perform a measurement. This gives new information with which to update our knowledge (identification).
The update in the probabilistic setting works with conditional probabilities ⇒ Bayes's theorem.
Repeated measurements lead to better identification.
Mathematical setup
Consider
\[ A(u; q) = f \;\Rightarrow\; u = S(f; q), \]
where $S$ is the solution operator.
The operator depends on parameters $q \in \mathcal{Q}$, hence the state $u \in \mathcal{U}$ is also a function of $q$.
Measurement operator $Y$ with values in $\mathcal{Y}$:
\[ y = Y(q; u) = Y(q, S(f; q)). \]
Examples of measurements:
(ODE) $u(t) = (x(t), y(t), z(t))^T$, $y(t) = (x(t), y(t))^T$
(PDE) $y(\omega) = \int_{D_0} u(\omega, x)\,dx$, $y(\omega) = \int_{D_0} |\operatorname{grad} u(\omega, x)|^2\,dx$, or $u$ at a few points
Inverse problem
For given $f$, the measurement $y$ is just a function of $q$. This function is usually not invertible ⇒ ill-posed problem: the measurement $y$ does not contain enough information.
In the Bayesian framework the state of knowledge is modelled in a probabilistic way: the parameters $q$ are uncertain and assumed to be random.
The Bayesian setting allows updating / sharpening of the information about $q$ when a measurement is performed.
The problem of updating the distribution (the state of knowledge of $q$) becomes well-posed.
It can be applied successively: each new measurement $y$ and forcing $f$ (which may also be uncertain) will provide new information.
Conditional probability and expectation
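With the state $u \in \mathscr{U} = \mathcal{U} \otimes \mathcal{S}$ a RV, the quantity to be measured
\[ y(\omega) = Y(q(\omega), u(\omega)) \in \mathscr{Y} := \mathcal{Y} \otimes \mathcal{S} \]
is also uncertain, a random variable.
A new measurement $z$ is performed, composed of the “true” value $y \in \mathcal{Y}$ and a random error $\varepsilon$: $z(\omega) = y + \varepsilon(\omega) \in \mathscr{Y}$.
Classically, Bayes's theorem gives the conditional probability
\[ P(I_q \mid M_z) = \frac{P(M_z \mid I_q)}{P(M_z)}\, P(I_q); \]
expectation with respect to this posterior measure is the conditional expectation. Kolmogorov starts from the conditional expectation $E(\cdot \mid M_z)$ and obtains the conditional probability from it via $P(I_q \mid M_z) = E(\chi_{I_q} \mid M_z)$.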
Conditional expectation
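The conditional expectation is defined as the orthogonal projection onto the closed subspace $L_2(\Omega, P, \sigma(z))$:
\[ E(q \mid \sigma(z)) := P_{\mathcal{Q}_\infty} q = \operatorname*{argmin}_{\tilde{q} \in L_2(\Omega, P, \sigma(z))} \| q - \tilde{q} \|^2_{L_2}. \]
The subspace $\mathcal{Q}_\infty := L_2(\Omega, P, \sigma(z))$ represents the available information; the estimate minimises $\Phi(\cdot) := \| q - (\cdot) \|^2$ over $\mathcal{Q}_\infty$.
The update, also called the assimilated value, $q_a(\omega) := P_{\mathcal{Q}_\infty} q = E(q \mid \sigma(z))$, is a $\mathcal{Q}$-valued RV and represents the new state of knowledge after the measurement.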
Approximation
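Minimisation is equivalent to orthogonality: find $\phi \in L_0(\mathcal{Y}, \mathcal{Q})$ such that
\[ \forall v \in \mathcal{Q}_\infty : \; \langle\langle D_{q_a}\Phi(q_a(\phi)), v \rangle\rangle_{L_2} = \langle\langle q - q_a, v \rangle\rangle_{L_2} = 0, \]
\[ \Leftrightarrow \; D_\phi \Phi := D_{q_a}\Phi \circ D_\phi q_a = 0 \quad \text{with } q_a(\phi) := \phi(z). \]
Approximation of $\mathcal{Q}_\infty$: take $\mathcal{Q}_n \subset \mathcal{Q}_\infty$,
\[ \mathcal{Q}_n := \{ \varphi \in \mathcal{Q} : \varphi = \psi_n \circ Y, \; \psi_n \text{ an } n\text{th degree polynomial} \}, \]
i.e. $\varphi = H_0 + H_1 Y + \dots + H_k Y^{\otimes k} + \dots + H_n Y^{\otimes n}$, where $H_k \in L^k_s(\mathcal{Y}, \mathcal{Q})$ is symmetric and $k$-linear.
With
\[ q_a(\phi) = q_a((H_0, \dots, H_k, \dots, H_n)) = \sum_{k=0}^{n} H_k z^{\otimes k} = P_{\mathcal{Q}_n} q, \]
orthogonality implies
\[ \forall \ell = 0, \dots, n : \; D_{H_\ell} \Phi(q_a(H_0, \dots, H_k, \dots, H_n)) = 0. \]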
Approximation
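\[ \Phi(q_a) = \| q - q_a \|^2 = \| q - (H_0 + H_1 Y + \dots + H_n Y^{\otimes n}) \|^2 = \langle q - (H_0 + H_1 Y + \dots + H_n Y^{\otimes n}), \; q - (H_0 + H_1 Y + \dots + H_n Y^{\otimes n}) \rangle. \]
$D_{H_0}\Phi(q_a(\dots))$: $\langle q - (H_0 + H_1 Y + \dots + H_n Y^{\otimes n}), 1 \rangle = 0$, i.e.
\[ H_0 + H_1 \langle Y \rangle + \dots + H_n \langle Y^{\otimes n} \rangle = \langle q \rangle. \]
$D_{H_1}\Phi(q_a(\dots))$: $\langle q - (H_0 + H_1 Y + \dots + H_n Y^{\otimes n}), Y \rangle = 0$, i.e.
\[ H_0 \langle Y \rangle + H_1 \langle Y^{\otimes 2} \rangle + \dots + H_n \langle Y^{\otimes (n+1)} \rangle = \langle q \otimes Y \rangle. \]
$D_{H_n}\Phi(q_a(\dots))$: $\langle q - (H_0 + H_1 Y + \dots + H_n Y^{\otimes n}), Y^{\otimes n} \rangle = 0$, i.e.
\[ H_0 \langle Y^{\otimes n} \rangle + H_1 \langle Y^{\otimes (n+1)} \rangle + \dots + H_n \langle Y^{\otimes 2n} \rangle = \langle q \otimes Y^{\otimes n} \rangle. \]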
Bayesian update in components
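For finite-dimensional spaces, or after discretisation, in components: let $q = (q^m)$, $Y = (Y^{\iota})$, and $H_k = ((H_k)^m_{\iota_1 \dots \iota_k})$; then, for all $\ell = 0, \dots, n$,
\[ \langle Y^{\iota_1} \cdots Y^{\iota_\ell} \rangle (H_0)^m + \dots + \langle Y^{\iota_1} \cdots Y^{\iota_{\ell+k}} \rangle (H_k)^m_{\iota_{\ell+1} \dots \iota_{\ell+k}} + \dots + \langle Y^{\iota_1} \cdots Y^{\iota_{\ell+n}} \rangle (H_n)^m_{\iota_{\ell+1} \dots \iota_{\ell+n}} = \langle q^m Y^{\iota_1} \cdots Y^{\iota_\ell} \rangle, \]
with summation over the repeated indices $\iota_{\ell+1}, \dots$ implied.
The matrix does not depend on $m$, so the system is block diagonal with identical blocks.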
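Ingredients of the quadratic Bayesian Update
\[ H_0 + H_1 \langle Y \rangle + H_2 \langle Y^{\otimes 2} \rangle = \langle q \rangle, \]
\[ H_0 \langle Y \rangle + H_1 \langle Y^{\otimes 2} \rangle + H_2 \langle Y^{\otimes 3} \rangle = \langle q \otimes Y \rangle, \]
\[ H_0 \langle Y^{\otimes 2} \rangle + H_1 \langle Y^{\otimes 3} \rangle + H_2 \langle Y^{\otimes 4} \rangle = \langle q \otimes Y^{\otimes 2} \rangle, \]
where $Y = (x(\omega), y(\omega), z(\omega))^T$ is expanded in a PCE, i.e. $Y \in \mathbb{R}^{3 \times |\mathcal{J}|}$; $\langle Y^{\otimes 2} \rangle - \langle Y \rangle^{\otimes 2}$ is the covariance matrix, and $\langle Y^{\otimes 4} \rangle$ is a deterministic tensor of size $3 \times 3 \times 3 \times 3$. In block form:
\[ \begin{pmatrix} 1 & \langle Y \rangle & \langle Y^{\otimes 2} \rangle \\ \langle Y \rangle & \langle Y^{\otimes 2} \rangle & \langle Y^{\otimes 3} \rangle \\ \langle Y^{\otimes 2} \rangle & \langle Y^{\otimes 3} \rangle & \langle Y^{\otimes 4} \rangle \end{pmatrix} \begin{pmatrix} H_0 \\ H_1 \\ H_2 \end{pmatrix} = \begin{pmatrix} \langle q \rangle \\ \langle q \otimes Y \rangle \\ \langle q \otimes Y^{\otimes 2} \rangle \end{pmatrix} \]
The linear system is symmetric positive definite (once linearly dependent rows and columns, which arise from the symmetry of $H_2$, are removed).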
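For a scalar $q$ and a scalar $Y$ the block system above reduces to a $3 \times 3$ Hankel moment matrix, which makes the structure easy to inspect. The sketch below is only an illustration: it estimates the moments by sample averages, whereas the slides obtain them from PCE coefficients without sampling; the prior, the forecast measurement model, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scalar prior and forecast measurement (illustration only).
q = rng.normal(1.0, 0.5, size=2000)
Y = q + 0.2 * q**2 + 0.1 * rng.normal(size=q.size)

# Moment (Hankel) matrix with entries <Y^(i+j)>, i, j = 0, 1, 2.
M = np.array([[np.mean(Y ** (i + j)) for j in range(3)] for i in range(3)])

# Right-hand side <q>, <q Y>, <q Y^2>.
rhs = np.array([np.mean(q * Y**i) for i in range(3)])

# Coefficients H0, H1, H2 of the quadratic map.
H = np.linalg.solve(M, rhs)

# The quadratic estimate and the orthogonality conditions it should satisfy:
# <(q - qa) Y^i> = 0 for i = 0, 1, 2.
qa = H[0] + H[1] * Y + H[2] * Y**2
residuals = np.array([np.mean((q - qa) * Y**i) for i in range(3)])
```

The residuals vanish up to round-off because the system is exactly the set of normal equations of the least-squares problem $\min \|q - q_a\|^2$ over quadratic polynomials in $Y$.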
Numerical solution of the system
Lorenz63: q = (x, y, z), Y = (x, y, z) + ε.
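For $x$ (for $y$, $z$ the same) the system above is $(1 + 3 + 9) \times (1 + 3 + 9) = 13 \times 13$ large.
But there are linear dependences (e.g. $(H_2)_{12} = (H_2)_{21}$)!
If the linearly dependent rows and columns are removed, one obtains a $10 \times 10$ system.
The solution is $((H_0)^i, (H_1)^i_1, (H_1)^i_2, (H_1)^i_3, (H_2)^i_{11}, \dots, (H_2)^i_{33})$, $i = 1, 2, 3$.
If $\varepsilon = 0$, then $q = Y$ and the system can be solved analytically:
\[ \begin{pmatrix} 1 & \langle q \rangle & \langle q^{\otimes 2} \rangle \\ \langle q \rangle & \langle q^{\otimes 2} \rangle & \langle q^{\otimes 3} \rangle \\ \langle q^{\otimes 2} \rangle & \langle q^{\otimes 3} \rangle & \langle q^{\otimes 4} \rangle \end{pmatrix} \begin{pmatrix} H_0 \\ H_1 \\ H_2 \end{pmatrix} = \begin{pmatrix} \langle q \rangle \\ \langle q \otimes q \rangle \\ \langle q \otimes q^{\otimes 2} \rangle \end{pmatrix} \]
and $H_0 = 0$, $H_1 = I$, $H_2 = 0$.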
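The noise-free solution can be checked in the scalar case, where the right-hand side coincides with the second column of the moment matrix. The sketch below assumes a scalar Gaussian $q \sim N(\mu, s^2)$ with hypothetical values of $\mu$ and $s$ and uses the closed-form Gaussian moments; it is a sanity check, not part of the original computation.

```python
import numpy as np

mu, s = 1.0, 0.5  # hypothetical mean and standard deviation of a scalar q

# Raw moments <q^k>, k = 0..4, of a Gaussian N(mu, s^2).
m = np.array([
    1.0,
    mu,
    mu**2 + s**2,
    mu**3 + 3.0 * mu * s**2,
    mu**4 + 6.0 * mu**2 * s**2 + 3.0 * s**4,
])

# With eps = 0 we have Y = q, so the matrix entries are <q^(i+j)>
# and the right-hand side is <q q^i> = <q^(i+1)>.
M = np.array([[m[i + j] for j in range(3)] for i in range(3)])
rhs = np.array([m[i + 1] for i in range(3)])

H = np.linalg.solve(M, rhs)  # expected to reproduce H0 = 0, H1 = 1, H2 = 0
```

Because `rhs` is exactly the second column of `M`, the unique solution is the second unit vector, which is the scalar counterpart of $H_0 = 0$, $H_1 = I$, $H_2 = 0$.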
Polynomial Chaos Expansion
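Let $Y(x, \theta)$, $\theta = (\theta_1, \dots, \theta_M, \dots)$, be approximated by
\[ Y(x, \theta) = \sum_{\beta \in \mathcal{J}_{m,p}} H_\beta(\theta)\, Y_\beta(x), \qquad Y_\beta(x) = \frac{1}{\beta!} \int_\Theta H_\beta(\theta)\, Y(x, \theta)\, P(d\theta). \]
Here $H_\beta$ are the multivariate polynomial basis functions of the PCE (distinct from the maps $H_k$ of the update).
In the Lorenz63 model the matrix of PCE coefficients is $Y \in \mathbb{R}^{3 \times |\mathcal{J}_{m,p}|}$, with $|\mathcal{J}_{m,p}| = \frac{(m+p)!}{m!\,p!}$.
E.g. the multi-index set for $Y \otimes Y$ will be $\mathcal{J}_{m,2p}$ and for $Y^{\otimes 4}$ it will be $\mathcal{J}_{m,4p}$.
If $(Y - Z)$ is needed for uncorrelated $Y$ (on $\mathcal{J}_{m_1,p}$) and $Z$ (on $\mathcal{J}_{m_2,p}$), then one must build $\mathcal{J}_{m_1+m_2,p}$.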
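The growth of the multi-index set can be made concrete with a small enumeration. The sketch assumes that $\mathcal{J}_{m,p}$ is the total-degree set (all multi-indices of length $m$ with $|\beta| \le p$), which is the set whose size matches the formula $(m+p)!/(m!\,p!)$; the function name is hypothetical.

```python
from itertools import product
from math import comb

def multi_indices(m, p):
    """All multi-indices beta of length m with total degree |beta| <= p."""
    return [beta for beta in product(range(p + 1), repeat=m) if sum(beta) <= p]

m, p = 3, 2
J = multi_indices(m, p)          # basis for Y itself
J_sq = multi_indices(m, 2 * p)   # basis needed for Y (x) Y
J_4 = multi_indices(m, 4 * p)    # basis needed for the fourth tensor power

# Cardinalities agree with (m + p)! / (m! p!).
sizes = (len(J), len(J_sq), len(J_4))
formula = (comb(m + p, p), comb(m + 2 * p, 2 * p), comb(m + 4 * p, 4 * p))

# Combining two uncorrelated expansions with m1 and m2 variables.
m1, m2 = 3, 3
J_joint = multi_indices(m1 + m2, p)
```

The brute-force enumeration is adequate only for small $m$ and $p$; it is meant to show how quickly the number of PCE coefficients grows when higher tensor powers or additional random variables are required.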
MAIN RESULT 1: Special cases
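For $n = 0$ (constant functions) ⇒ $q_a = H_0 = \langle q \rangle$ ($= E(q)$).
For $n = 1$ the approximation is with affine functions:
\[ H_0 + H_1 \langle z \rangle = \langle q \rangle, \]
\[ H_0 \langle z \rangle + H_1 \langle z \otimes z \rangle = \langle q \otimes z \rangle \]
⟹ (remember that $[\mathrm{cov}_{qz}] = \langle q \otimes z \rangle - \langle q \rangle \otimes \langle z \rangle$)
\[ H_0 = \langle q \rangle - H_1 \langle z \rangle, \]
\[ H_1 \left( \langle z \otimes z \rangle - \langle z \rangle \otimes \langle z \rangle \right) = \langle q \otimes z \rangle - \langle q \rangle \otimes \langle z \rangle \]
⇒ $H_1 = [\mathrm{cov}_{qz}][\mathrm{cov}_{zz}]^{-1}$ (Kalman's solution),
\[ H_0 = \langle q \rangle - [\mathrm{cov}_{qz}][\mathrm{cov}_{zz}]^{-1} \langle z \rangle, \]
and finally
\[ q_a = H_0 + H_1 z = \langle q \rangle + [\mathrm{cov}_{qz}][\mathrm{cov}_{zz}]^{-1} (z - \langle z \rangle). \]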
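The affine case translates directly into a few lines of linear algebra. The sketch below takes the means and covariances as given inputs (how they are obtained, e.g. from PCE coefficients, is outside its scope); the function name and the numerical values are hypothetical.

```python
import numpy as np

def linear_update(mean_q, mean_z, cov_qz, cov_zz, z_obs):
    """Return qa = <q> + K (z_obs - <z>) with K = cov_qz cov_zz^{-1}."""
    # Solve cov_zz^T X = cov_qz^T instead of forming the inverse explicitly;
    # then K = X^T = cov_qz cov_zz^{-1}.
    K = np.linalg.solve(cov_zz.T, cov_qz.T).T
    return mean_q + K @ (z_obs - mean_z), K

# Hypothetical example: z = q + eps with unit prior covariance and unit noise
# variance, so cov_qz = I and cov_zz = I + I = 2 I.
mean_q = np.zeros(3)
mean_z = np.zeros(3)
cov_qz = np.eye(3)
cov_zz = 2.0 * np.eye(3)
z_obs = np.array([2.0, 4.0, 6.0])

qa, K = linear_update(mean_q, mean_z, cov_qz, cov_zz, z_obs)
```

With equal prior and noise variances the gain is $K = \tfrac{1}{2} I$, so the assimilated value lies halfway between the prior mean and the observation.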
Measurement error
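Let $\varepsilon \sim N(0, \sigma^2 I)$ be the Gaussian noise.
Assuming the RVs $Y$ and $\varepsilon$ are independent (for the second moment, uncorrelatedness suffices), we obtain
\[ \langle Y + \varepsilon \rangle = \langle Y \rangle + \langle \varepsilon \rangle = \langle Y \rangle, \]
\[ \langle (Y + \varepsilon)^{\otimes 2} \rangle = \langle Y^{\otimes 2} \rangle + 2 \langle Y \rangle \langle \varepsilon \rangle + \langle \varepsilon^{\otimes 2} \rangle = \langle Y^{\otimes 2} \rangle + \sigma^2 I, \]
\[ \langle (Y + \varepsilon)^{\otimes 3} \rangle = \langle Y^{\otimes 3} \rangle + 3 \langle Y \rangle \langle \varepsilon^{\otimes 2} \rangle, \]
\[ \langle (Y + \varepsilon)^{\otimes 4} \rangle = \langle Y^{\otimes 4} \rangle + 6 \langle Y^{\otimes 2} \rangle \langle \varepsilon^{\otimes 2} \rangle + \langle \varepsilon^{\otimes 4} \rangle \]
(since $\langle \varepsilon^{\otimes (2k+1)} \rangle = 0$).
One should be careful in computing the RHS $\langle q \otimes Y \rangle$, $\langle q \otimes Y^{\otimes 2} \rangle$, ..., since the RVs $q$ and $Y$ belong to different, overlapping spaces.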
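The moment identities can be verified exactly in the scalar case. The sketch below is an assumption-laden check: it replaces the Gaussian noise by a symmetric two-point distribution $\pm\sigma$, which shares the two properties the identities rely on (independence from $Y$ and vanishing odd moments), and it uses a hypothetical three-point distribution for $Y$ so that all expectations are finite sums.

```python
import numpy as np

# Hypothetical discrete distribution for a scalar Y.
y_vals = np.array([-1.0, 0.5, 2.0])
y_prob = np.array([0.2, 0.5, 0.3])

# Symmetric two-point noise: zero odd moments, <eps^2> = sigma^2 (not Gaussian).
sigma = 0.3
e_vals = np.array([-sigma, sigma])
e_prob = np.array([0.5, 0.5])

def mom(vals, prob, k):
    """k-th raw moment of a discrete distribution."""
    return float(np.sum(prob * vals**k))

def mom_sum(k):
    """Exact k-th moment of Y + eps for independent Y and eps."""
    s = y_vals[:, None] + e_vals[None, :]
    w = y_prob[:, None] * e_prob[None, :]
    return float(np.sum(w * s**k))

# Right-hand sides of the identities, written out for scalars.
m1 = mom(y_vals, y_prob, 1)
m2 = mom(y_vals, y_prob, 2) + mom(e_vals, e_prob, 2)
m3 = mom(y_vals, y_prob, 3) + 3.0 * mom(y_vals, y_prob, 1) * mom(e_vals, e_prob, 2)
m4 = (mom(y_vals, y_prob, 4)
      + 6.0 * mom(y_vals, y_prob, 2) * mom(e_vals, e_prob, 2)
      + mom(e_vals, e_prob, 4))
```

In the tensor-valued setting of the slides the mixed terms additionally have to be symmetrised over the index positions; the scalar check does not cover that aspect.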
MAIN RESULT 2: Final Step of Non-linear BU
After $H_0$, $H_1$, $H_2$ are computed, evaluate
\[ q_a = H_0 + H_1 (z - Y) + H_2 (z - Y)^{\otimes 2}, \]
where $z = z(\omega) = y(\omega) + \varepsilon(\omega)$ and $Y$ are the noisy measurements.
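In components, applying the quadratic map to a vector argument $d$ (on the slide, $d = z - Y$) is a contraction of $H_2$ with $d$ twice. A minimal sketch for one realisation, with hypothetical names and values and with $H_2$ stored as an array of shape `(m, n, n)`:

```python
import numpy as np

def quadratic_map(H0, H1, H2, d):
    """Evaluate H0 + H1 d + H2 (d (x) d) for a single vector argument d.

    H0 has shape (m,), H1 shape (m, n), H2 shape (m, n, n) and is assumed
    symmetric in its last two indices.
    """
    return H0 + H1 @ d + np.einsum("mij,i,j->m", H2, d, d)

# Hypothetical coefficients: a single symmetric pair of quadratic entries.
H0 = np.zeros(3)
H1 = np.zeros((3, 3))
H2 = np.zeros((3, 3, 3))
H2[0, 0, 1] = H2[0, 1, 0] = 0.5

d = np.array([1.0, 2.0, 3.0])
qa = quadratic_map(H0, H1, H2, d)

# Noise-free coefficients H0 = 0, H1 = I, H2 = 0 return the argument unchanged.
qa_identity = quadratic_map(np.zeros(3), np.eye(3), np.zeros((3, 3, 3)), d)
```

When the quantities are PCE-represented random variables rather than single realisations, the products in the quadratic term require the tensor arithmetic mentioned in the conclusion; this sketch does not attempt that.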
Numerics
Figure 1: One update in the linearized Lorenz63.
Figure 2: Linear (left) and non-linear (right) Bayesian updates in Lorenz63.
Conclusion about NLBU
• + It is a functional, sampling-free approximation.
• + It gives hope of being usable for solving inverse problems.
• - The stochastic dimension grows very fast.
• - It is computationally very “expensive” and requires efficient tensor arithmetic.
Literature
1. O. Pajonk, B. V. Rosic, A. Litvinenko, and H. G. Matthies, A Deterministic Filter for Non-Gaussian Bayesian Estimation, Physica D: Nonlinear Phenomena, Vol. 241(7), pp. 775-788, 2012.
2. B. V. Rosic, A. Litvinenko, O. Pajonk, and H. G. Matthies, Sampling Free Linear Bayesian Update of Polynomial Chaos Representations, J. of Comput. Physics, Vol. 231(17), pp. 5761-5787, 2012.