Non-sampling functional approximation of linear and non-linear Bayesian Update


Transcript of Non-sampling functional approximation of linear and non-linear Bayesian Update

Page 1: Non-sampling functional approximation of linear and non-linear Bayesian Update

Non-sampling functional approximation of linear and non-linear Bayesian Update

Alexander Litvinenko

H. G. Matthies

Institute of Scientific Computing, TU Braunschweig

Brunswick, Germany

[email protected]

http://www.wire.tu-bs.de

Page 2: Non-sampling functional approximation of linear and non-linear Bayesian Update


Lorenz-63 Problem

The Lorenz 1963 model is a system of ODEs. It has chaotic solutions for certain parameter values and initial conditions. [The linear Bayesian update (LBU) is due to O. Pajonk.]

$$\dot{x} = \sigma (y - x), \qquad \dot{y} = x (\rho - z) - y, \qquad \dot{z} = x y - \beta z$$

The initial state $q_0(\omega) = (x_0(\omega), y_0(\omega), z_0(\omega))$ is uncertain.

Solve on $t_0, t_1, \ldots, t_{10}$, take a noisy measurement → UPDATE; solve on $t_{11}, t_{12}, \ldots, t_{20}$, take a noisy measurement → UPDATE; ...

IDEA of the Bayesian update (BU): take $q_f(\omega) = q_0(\omega)$.
Linear BU: $q_a = q_f + K \cdot (z - y)$.
Non-linear BU: $q_a = q_f + H_1 \cdot (z - y) + (z - y)^T \cdot H_2 \cdot (z - y)$.
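A minimal Python sketch of this solve-measure-update loop, applying the linear BU formula above. For brevity the required moments are estimated from a small Monte Carlo ensemble, whereas the talk's sampling-free method computes them from PCE coefficients; the ensemble size, noise level, and time grid below are illustrative assumptions.

```python
# Sketch (not the authors' code): propagate an uncertain initial state
# through Lorenz-63 and apply the linear BU  q_a = q_f + K (z - y).
import numpy as np

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0        # classical Lorenz-63 parameters

def lorenz_rhs(q):
    x, y, z = q
    return np.array([SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z])

def rk4_step(q, dt):
    k1 = lorenz_rhs(q)
    k2 = lorenz_rhs(q + 0.5 * dt * k1)
    k3 = lorenz_rhs(q + 0.5 * dt * k2)
    k4 = lorenz_rhs(q + dt * k3)
    return q + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

rng = np.random.default_rng(0)
N, dt, n_steps, sigma_eps = 1000, 0.01, 10, 0.5
q_truth = np.array([1.0, 1.0, 1.0])
qf = q_truth + 0.2 * rng.standard_normal((N, 3))    # uncertain q0(omega)

for _ in range(n_steps):                            # solve in t0, ..., t10
    q_truth = rk4_step(q_truth, dt)
    qf = np.array([rk4_step(q, dt) for q in qf])

y = qf.copy()                                       # here Y(q) = q (full state observed)
z = q_truth + sigma_eps * rng.standard_normal(3)    # noisy measurement
cov_qy = np.cov(qf.T, y.T)[:3, 3:]                  # cov(q_f, y)
cov_zz = np.cov(y.T) + sigma_eps**2 * np.eye(3)     # cov(y) + noise covariance
K = cov_qy @ np.linalg.inv(cov_zz)                  # Kalman-type gain
qa = qf + (z - y) @ K.T                             # linear BU: q_a = q_f + K (z - y)
print("prior mean:", qf.mean(0), "posterior mean:", qa.mean(0))
```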


Page 3: Non-sampling functional approximation of linear and non-linear Bayesian Update


Setting for the identification process

General idea: we observe / measure a system whose structure we know in principle.

The system behaviour depends on some quantities (parameters) which we do not know ⇒ uncertainty.

We model (the uncertainty in) our knowledge in a Bayesian setting: as a probability distribution on the parameters.

We start with what we know a priori, then perform a measurement. This gives new information with which to update our knowledge (identification).

The update in the probabilistic setting works with conditional probabilities ⇒ Bayes's theorem.

Repeated measurements lead to better identification.


Page 4: Non-sampling functional approximation of linear and non-linear Bayesian Update


Mathematical setup

Consider

$$A(u; q) = f \quad \Rightarrow \quad u = S(f; q),$$

where $S$ is the solution operator.

The operator depends on parameters $q \in \mathcal{Q}$; hence the state $u \in \mathcal{U}$ is also a function of $q$.

Measurement operator $Y$ with values in $\mathcal{Y}$:

$$y = Y(q; u) = Y(q, S(f; q)).$$

Examples of measurements:
(ODE) $u(t) = (x(t), y(t), z(t))^T$, $\ y(t) = (x(t), y(t))^T$
(PDE) $y(\omega) = \int_{D_0} u(\omega, x) \, dx$, $\ y(\omega) = \int_{D_0} |\operatorname{grad} u(\omega, x)|^2 \, dx$, or $u$ evaluated at a few points
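As a small illustration (the discretisations and function names below are assumptions, not from the talk), such measurement operators could look like:

```python
# Illustrative measurement operators: partial state observation (ODE case)
# and simple quadrature functionals of a discretised field (PDE case).
import numpy as np

def observe_ode(u):
    """ODE case: observe the first two components, y(t) = (x(t), y(t))^T."""
    return u[:2]

def observe_pde_mean(u_h, dx):
    """PDE case: y = integral of u over D0, by a Riemann sum."""
    return float(np.sum(u_h) * dx)

def observe_pde_grad_energy(u_h, dx):
    """PDE case: y = integral of |grad u|^2 over D0 (finite differences)."""
    return float(np.sum(np.gradient(u_h, dx) ** 2) * dx)

print(observe_ode(np.array([1.0, 2.0, 3.0])))       # -> [1. 2.]
u_h = np.sin(np.linspace(0.0, np.pi, 1001))         # sample field on D0 = [0, pi]
print(observe_pde_mean(u_h, np.pi / 1000))          # ~ 2
print(observe_pde_grad_energy(u_h, np.pi / 1000))   # ~ pi/2
```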


Page 5: Non-sampling functional approximation of linear and non-linear Bayesian Update


Inverse problem

For given $f$, the measurement $y$ is just a function of $q$. This function is usually not invertible ⇒ ill-posed problem: the measurement $y$ does not contain enough information.

In the Bayesian framework the state of knowledge is modelled in a probabilistic way: the parameters $q$ are uncertain and assumed to be random.

The Bayesian setting allows updating / sharpening of the information about $q$ when a measurement is performed.

The problem of updating the distribution (the state of knowledge of $q$) becomes well-posed.

This can be applied successively; each new measurement $y$ and forcing $f$ (which may also be uncertain) provides new information.


Page 6: Non-sampling functional approximation of linear and non-linear Bayesian Update


Conditional probability and expectation

With the state $u \in \mathcal{U} = U \otimes S$ a RV, the quantity to be measured,

$$y(\omega) = Y(q(\omega), u(\omega)) \in \mathcal{Y} := Y \otimes S,$$

is also uncertain, a random variable. A new measurement $z$ is performed, composed from the "true" value $y \in \mathcal{Y}$ and a random error $\varepsilon$: $\ z(\omega) = y + \varepsilon(\omega) \in \mathcal{Y}$.

Classically, Bayes's theorem gives the conditional probability

$$P(\mathcal{I}_q \mid \mathcal{M}_z) = \frac{P(\mathcal{M}_z \mid \mathcal{I}_q)}{P(\mathcal{M}_z)} \, P(\mathcal{I}_q);$$

the expectation with this posterior measure is the conditional expectation. Kolmogorov starts from the conditional expectation $E(\cdot \mid \mathcal{M}_z)$ and obtains the conditional probability from it via $P(\mathcal{I}_q \mid \mathcal{M}_z) = E(\chi_{\mathcal{I}_q} \mid \mathcal{M}_z)$.


Page 7: Non-sampling functional approximation of linear and non-linear Bayesian Update


Conditional expectation

The conditional expectation is defined as the orthogonal projection onto the closed subspace $L_2(\Omega, P, \sigma(z))$:

$$E(q \mid \sigma(z)) := P_{\mathcal{Q}_\infty} q = \operatorname*{arg\,min}_{\tilde{q} \in L_2(\Omega, P, \sigma(z))} \| q - \tilde{q} \|^2_{L_2}.$$

The subspace $\mathcal{Q}_\infty := L_2(\Omega, P, \sigma(z))$ represents the available information; the estimate minimises $\Phi(\cdot) := \| q - (\cdot) \|^2$ over $\mathcal{Q}_\infty$.

The update, also called the assimilated value $q_a(\omega) := P_{\mathcal{Q}_\infty} q = E(q \mid \sigma(z))$, is a $\mathcal{Q}$-valued RV and represents the new state of knowledge after the measurement.


Page 8: Non-sampling functional approximation of linear and non-linear Bayesian Update


Approximation

Minimisation is equivalent to orthogonality: find $\varphi \in L_0(\mathcal{Y}, \mathcal{Q})$ such that

$$\forall v \in \mathcal{Q}_\infty : \ \langle\langle D_{q_a} \Phi(q_a(\varphi)), v \rangle\rangle_{L_2} = \langle\langle q - q_a, v \rangle\rangle_{L_2} = 0$$

$$\Leftrightarrow \ D_\varphi \Phi := D_{q_a} \Phi \circ D_\varphi q_a = 0 \quad \text{with} \quad q_a(\varphi) := \varphi(z).$$

Approximation of $\mathcal{Q}_\infty$: take $\mathcal{Q}_n \subset \mathcal{Q}_\infty$,

$$\mathcal{Q}_n := \{ \varphi \in \mathcal{Q} : \varphi = \psi_n \circ Y, \ \psi_n \text{ an } n\text{th-degree polynomial} \},$$

i.e. $\varphi = H_0 + H_1 Y + \cdots + H_k Y^{\otimes k} + \cdots + H_n Y^{\otimes n}$, where $H_k \in \mathscr{L}_s^k(\mathcal{Y}, \mathcal{Q})$ is symmetric and $k$-linear.

With

$$q_a(\varphi) = q_a((H_0, \ldots, H_k, \ldots, H_n)) = \sum_{k=0}^n H_k z^{\otimes k} = P_{\mathcal{Q}_n} q,$$

orthogonality implies

$$\forall \ell = 0, \ldots, n : \ D_{H_\ell} \Phi(q_a(H_0, \ldots, H_k, \ldots, H_n)) = 0.$$


Page 9: Non-sampling functional approximation of linear and non-linear Bayesian Update


Approximation

$$\Phi(q_a) = \|q - q_a\|^2 = \|q - (H_0 + H_1 Y + \cdots + H_n Y^{\otimes n})\|^2 = \langle q - (H_0 + H_1 Y + \cdots + H_n Y^{\otimes n}), \ q - (H_0 + H_1 Y + \cdots + H_n Y^{\otimes n}) \rangle.$$

$D_{H_0} \Phi(q_a(\ldots)) : \ \langle q - (H_0 + H_1 Y + \cdots + H_n Y^{\otimes n}), \ 1 \rangle = 0$, i.e.

$$H_0 + H_1 \langle Y \rangle + \cdots + H_n \langle Y^{\otimes n} \rangle = \langle q \rangle.$$

$D_{H_1} \Phi(q_a(\ldots)) : \ \langle q - (H_0 + H_1 Y + \cdots + H_n Y^{\otimes n}), \ Y \rangle = 0$, i.e.

$$H_0 \langle Y \rangle + H_1 \langle Y \otimes Y \rangle + \cdots + H_n \langle Y^{\otimes n+1} \rangle = \langle q \otimes Y \rangle.$$

$D_{H_n} \Phi(q_a(\ldots)) : \ \langle q - (H_0 + H_1 Y + \cdots + H_n Y^{\otimes n}), \ Y^{\otimes n} \rangle = 0$, i.e.

$$H_0 \langle Y^{\otimes n} \rangle + H_1 \langle Y^{\otimes n+1} \rangle + \cdots + H_n \langle Y^{\otimes 2n} \rangle = \langle q \otimes Y^{\otimes n} \rangle.$$
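In the scalar case these conditions form a Hankel-type moment system $M H = b$ with $M_{\ell k} = \langle Y^{\ell+k} \rangle$ and $b_\ell = \langle q Y^\ell \rangle$. A brief sketch, with moments estimated from samples purely for illustration (the talk computes them sampling-free from PCE coefficients):

```python
# Sketch: solve the degree-n moment system for scalar q and Y.
import numpy as np

def solve_moment_system(q_samples, y_samples, n):
    moments = [np.mean(y_samples**j) for j in range(2 * n + 1)]
    M = np.array([[moments[l + k] for k in range(n + 1)] for l in range(n + 1)])
    b = np.array([np.mean(q_samples * y_samples**l) for l in range(n + 1)])
    return np.linalg.solve(M, b)            # H = (H_0, ..., H_n)

rng = np.random.default_rng(1)
y = rng.standard_normal(200_000)
q = y**2 + 0.1 * rng.standard_normal(200_000)   # q is (nearly) quadratic in Y
print(solve_moment_system(q, y, n=2))           # ~ (0, 0, 1): H_2 recovers the square
```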


Page 10: Non-sampling functional approximation of linear and non-linear Bayesian Update


Bayesian update in components

For finite-dimensional spaces, or after discretisation, in components: let $q = (q^m)$, $Y = (Y^{\iota})$, and $H_k = (H_k^{m \, \iota_1 \ldots \iota_k})$; then

$$\forall \ell = 0, \ldots, n : \quad \langle Y^{\iota_1} \cdots Y^{\iota_\ell} \rangle (H_0^m) + \cdots + \langle Y^{\iota_1} \cdots Y^{\iota_{\ell+k}} \rangle (H_k^{m \, \iota_{\ell+1} \ldots \iota_{\ell+k}}) + \cdots + \langle Y^{\iota_1} \cdots Y^{\iota_{\ell+n}} \rangle (H_n^{m \, \iota_{\ell+1} \ldots \iota_{\ell+n}}) = \langle q^m Y^{\iota_1} \cdots Y^{\iota_\ell} \rangle.$$

The matrix does not depend on $m$; it is identically block diagonal.


Page 11: Non-sampling functional approximation of linear and non-linear Bayesian Update


Ingredients of the quadratic Bayesian update

$$H_0 + H_1 \langle Y \rangle + H_2 \langle Y^{\otimes 2} \rangle = \langle q \rangle,$$
$$H_0 \langle Y \rangle + H_1 \langle Y^{\otimes 2} \rangle + H_2 \langle Y^{\otimes 3} \rangle = \langle q \otimes Y \rangle,$$
$$H_0 \langle Y^{\otimes 2} \rangle + H_1 \langle Y^{\otimes 3} \rangle + H_2 \langle Y^{\otimes 4} \rangle = \langle q \otimes Y^{\otimes 2} \rangle,$$

where $Y = (x(\omega), y(\omega), z(\omega))^T$ is expanded in a PCE, i.e. $Y \in \mathbb{R}^{3 \times |\mathcal{J}|}$; $\langle Y^{\otimes 2} \rangle - \langle Y \rangle^{\otimes 2}$ is the covariance matrix, and $\langle Y^{\otimes 4} \rangle$ is a deterministic tensor of size $3 \times 3 \times 3 \times 3$. In block form:

$$\begin{pmatrix} 1 & \langle Y \rangle & \langle Y^{\otimes 2} \rangle \\ \langle Y \rangle & \langle Y^{\otimes 2} \rangle & \langle Y^{\otimes 3} \rangle \\ \langle Y^{\otimes 2} \rangle & \langle Y^{\otimes 3} \rangle & \langle Y^{\otimes 4} \rangle \end{pmatrix} \begin{pmatrix} H_0 \\ H_1 \\ H_2 \end{pmatrix} = \begin{pmatrix} \langle q \rangle \\ \langle q \otimes Y \rangle \\ \langle q \otimes Y^{\otimes 2} \rangle \end{pmatrix}$$

The linear system is symmetric positive definite.
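A sketch of assembling this block system for one component of q in the 3-D case, with moments estimated from samples for illustration only (the talk computes them from PCE coefficients). Note the rank deficiency caused by the symmetry of $H_2$, which the next slide removes explicitly:

```python
# Sketch: assemble the (1+3+9) x (1+3+9) = 13 x 13 moment system for one
# component of q and solve for (H_0, H_1, H_2) in flattened form.
import numpy as np

def features(Y):
    """phi(Y) = (1, Y_1..Y_3, (Y ox Y)_11..(Y ox Y)_33) per sample."""
    N = Y.shape[0]
    quad = np.einsum('ni,nj->nij', Y, Y).reshape(N, 9)
    return np.hstack([np.ones((N, 1)), Y, quad])

rng = np.random.default_rng(4)
Y = rng.standard_normal((100_000, 3))
q = Y[:, 0] ** 2 - Y[:, 1]                # q depends quadratically on Y
Phi = features(Y)
M = Phi.T @ Phi / len(Y)                  # blocks <Y^(ox(l+k))>, 13 x 13
b = Phi.T @ q / len(Y)                    # RHS blocks <q Y^(ox l)>
H = np.linalg.lstsq(M, b, rcond=None)[0]  # lstsq: M is singular because the
print(H.round(2))                         # duplicated H_2 entries repeat columns
# ~ [0, 0, -1, 0, 1, 0, ...]: recovers q = Y_1^2 - Y_2
```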


Page 12: Non-sampling functional approximation of linear and non-linear Bayesian Update


Numerical solution of the system

Lorenz63: q = (x, y, z), Y = (x, y, z) + ε.

For $x$ (and likewise for $y$, $z$) the system above is $(1 + 3 + 9) \times (1 + 3 + 9) = 13 \times 13$ large. But there are linear dependences (e.g. $H_2^{12} = H_2^{21}$)! Removing the linearly dependent rows and columns yields a $10 \times 10$ system. The solution is $(H_0^i, H_1^{i1}, H_1^{i2}, H_1^{i3}, H_2^{i11}, \ldots, H_2^{i33})$, $i = 1, 2, 3$.

If $\varepsilon = 0$, then $q = Y$ and the system can be solved analytically:

$$\begin{pmatrix} 1 & \langle q \rangle & \langle q^{\otimes 2} \rangle \\ \langle q \rangle & \langle q^{\otimes 2} \rangle & \langle q^{\otimes 3} \rangle \\ \langle q^{\otimes 2} \rangle & \langle q^{\otimes 3} \rangle & \langle q^{\otimes 4} \rangle \end{pmatrix} \begin{pmatrix} H_0 \\ H_1 \\ H_2 \end{pmatrix} = \begin{pmatrix} \langle q \rangle \\ \langle q \otimes q \rangle \\ \langle q \otimes q^{\otimes 2} \rangle \end{pmatrix},$$

and $H_0 = 0$, $H_1 = I$, $H_2 = 0$.
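A quick numerical check of this claim in the scalar case (moments sampled purely for illustration):

```python
# With epsilon = 0 we have q = Y, and the 3x3 moment system returns
# H_0 = 0, H_1 = 1, H_2 = 0, as stated on this slide.
import numpy as np

rng = np.random.default_rng(2)
q = rng.standard_normal(1_000_000)
m = [np.mean(q**j) for j in range(5)]       # <q^0> .. <q^4>
M = np.array([[m[0], m[1], m[2]],
              [m[1], m[2], m[3]],
              [m[2], m[3], m[4]]])
b = np.array([m[1], m[2], m[3]])            # RHS <q>, <q^2>, <q^3>
print(np.linalg.solve(M, b))                # ~ [0. 1. 0.]
```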


Page 13: Non-sampling functional approximation of linear and non-linear Bayesian Update


Polynomial Chaos Expansion

Let $Y(x, \boldsymbol{\theta})$, $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_M, \ldots)$, be approximated as

$$Y(x, \boldsymbol{\theta}) = \sum_{\beta \in \mathcal{J}_{m,p}} H_\beta(\boldsymbol{\theta}) Y_\beta(x), \qquad Y_\beta(x) = \frac{1}{\beta!} \int_\Theta H_\beta(\boldsymbol{\theta}) Y(x, \boldsymbol{\theta}) \, \mathbb{P}(d\boldsymbol{\theta}).$$

In the Lorenz-63 model the matrix of PCE coefficients is $Y \in \mathbb{R}^{3 \times |\mathcal{J}_{m,p}|}$, with $|\mathcal{J}_{m,p}| = \frac{(m+p)!}{m! \, p!}$.

E.g. the multi-index set for $Y \otimes Y$ will be $\mathcal{J}_{m,2p}$ and for $Y^{\otimes 4}$ will be $\mathcal{J}_{m,4p}$.

If $(Y - Z)$ is needed, for uncorrelated $Y$ (on $\mathcal{J}_{m_1,p}$) and $Z$ (on $\mathcal{J}_{m_2,p}$), then one must build $\mathcal{J}_{m_1+m_2,p}$.
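These index-set sizes can be checked directly, since $(m+p)!/(m!\,p!)$ is the binomial coefficient $\binom{m+p}{p}$; the growth below is exactly the "stochastic dimension grows very fast" issue noted in the conclusion:

```python
# Size of the PCE multi-index set |J_{m,p}| = (m+p)! / (m! p!).
from math import comb

for m, p in [(3, 2), (3, 4), (3, 8), (10, 4), (20, 4)]:
    print(f"|J_{{{m},{p}}}| = {comb(m + p, p)}")
# e.g. computing <Y^(ox4)> needs J_{m,4p}: for m = 3, p = 2 that is
# |J_{3,8}| = 165 already, versus |J_{3,2}| = 10 for Y itself.
```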


Page 14: Non-sampling functional approximation of linear and non-linear Bayesian Update


MAIN RESULT 1: Special cases

For $n = 0$ (constant functions) $\Rightarrow q_a = H_0 = \langle q \rangle \ (= E(q))$.

For $n = 1$ the approximation is with affine functions:

$$H_0 + H_1 \langle z \rangle = \langle q \rangle$$
$$H_0 \langle z \rangle + H_1 \langle z \otimes z \rangle = \langle q \otimes z \rangle$$

$\Longrightarrow$ (remember that $[\operatorname{cov}_{qz}] = \langle q \otimes z \rangle - \langle q \rangle \otimes \langle z \rangle$)

$$H_0 = \langle q \rangle - H_1 \langle z \rangle$$
$$H_1 (\langle z \otimes z \rangle - \langle z \rangle \otimes \langle z \rangle) = \langle q \otimes z \rangle - \langle q \rangle \otimes \langle z \rangle$$

$\Rightarrow H_1 = [\operatorname{cov}_{qz}][\operatorname{cov}_{zz}]^{-1}$ (Kalman's solution), $\quad H_0 = \langle q \rangle - [\operatorname{cov}_{qz}][\operatorname{cov}_{zz}]^{-1} \langle z \rangle$,

and finally

$$q_a = H_0 + H_1 z = \langle q \rangle + [\operatorname{cov}_{qz}][\operatorname{cov}_{zz}]^{-1} (z - \langle z \rangle).$$
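A short sketch of this n = 1 (Kalman) case; the helper name and the toy moments below are assumptions for illustration:

```python
# Assemble H_1 = cov_qz cov_zz^{-1} (the Kalman gain) and H_0 from first
# and second moments, then evaluate q_a = <q> + H_1 (z - <z>).
import numpy as np

def linear_bu(mean_q, mean_z, cov_qz, cov_zz, z):
    H1 = cov_qz @ np.linalg.inv(cov_zz)     # Kalman's solution
    H0 = mean_q - H1 @ mean_z
    return H0 + H1 @ z                      # = <q> + H1 (z - <z>)

# toy 2-D example with made-up moments
mean_q, mean_z = np.array([1.0, 0.0]), np.array([0.9, 0.1])
cov_qz = np.array([[0.5, 0.1], [0.1, 0.4]])
cov_zz = np.array([[0.7, 0.2], [0.2, 0.6]])
print(linear_bu(mean_q, mean_z, cov_qz, cov_zz, z=np.array([1.2, -0.3])))
```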


Page 15: Non-sampling functional approximation of linear and non-linear Bayesian Update


Measurement error

Let $\varepsilon \sim N(0, \sigma^2 I)$ be Gaussian noise. Since the RVs $Y$ and $\varepsilon$ are uncorrelated (indeed independent), we obtain

$$\langle Y + \varepsilon \rangle = \langle Y \rangle + \langle \varepsilon \rangle = \langle Y \rangle,$$
$$\langle (Y + \varepsilon)^{\otimes 2} \rangle = \langle Y^{\otimes 2} \rangle + 2 \langle Y \rangle \langle \varepsilon \rangle + \langle \varepsilon^{\otimes 2} \rangle = \langle Y^{\otimes 2} \rangle + \sigma^2 I,$$
$$\langle (Y + \varepsilon)^{\otimes 3} \rangle = \langle Y^{\otimes 3} \rangle + 3 \langle Y \rangle \langle \varepsilon^{\otimes 2} \rangle,$$
$$\langle (Y + \varepsilon)^{\otimes 4} \rangle = \langle Y^{\otimes 4} \rangle + 6 \langle Y^{\otimes 2} \rangle \langle \varepsilon^{\otimes 2} \rangle + \langle \varepsilon^{\otimes 4} \rangle$$

(since $\langle \varepsilon^{\otimes (2k+1)} \rangle = 0$).

One should be careful in computing the RHS terms $\langle q \otimes Y \rangle, \langle q \otimes Y^{\otimes 2} \rangle, \ldots$, since the RVs $q$ and $Y$ belong to different, overlapping spaces.
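In the scalar case these corrections read $\langle (Y+\varepsilon)^2 \rangle = \langle Y^2 \rangle + \sigma^2$, $\langle (Y+\varepsilon)^3 \rangle = \langle Y^3 \rangle + 3 \langle Y \rangle \sigma^2$, and $\langle (Y+\varepsilon)^4 \rangle = \langle Y^4 \rangle + 6 \langle Y^2 \rangle \sigma^2 + 3 \sigma^4$ (using $\langle \varepsilon^4 \rangle = 3 \sigma^4$ for a scalar Gaussian). A brief check against sampled moments:

```python
# Noisy raw moments of z = Y + eps from the noise-free moments of Y.
import numpy as np

def noisy_moments(m, sigma):
    """m = [<Y>, <Y^2>, <Y^3>, <Y^4>]; returns moments of Y + eps."""
    s2 = sigma**2
    return [m[0],
            m[1] + s2,                         # <(Y+eps)^2>
            m[2] + 3 * m[0] * s2,              # <(Y+eps)^3>
            m[3] + 6 * m[1] * s2 + 3 * s2**2]  # <(Y+eps)^4>

rng = np.random.default_rng(3)
Y = rng.gamma(2.0, size=1_000_000)             # any noise-free RV works
eps = 0.5 * rng.standard_normal(Y.size)
exact = [np.mean((Y + eps) ** k) for k in (1, 2, 3, 4)]
pred = noisy_moments([np.mean(Y**k) for k in (1, 2, 3, 4)], 0.5)
print(np.allclose(exact, pred, rtol=1e-2))     # True up to sampling error
```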


Page 16: Non-sampling functional approximation of linear and non-linear Bayesian Update


MAIN RESULT 2: Final Step of Non-linear BU

After $H_0$, $H_1$, $H_2$ are computed, evaluate

$$q_a = H_0 + H_1 (z - Y) + H_2 (z - Y)^{\otimes 2},$$

where $z = z(\omega) = y(\omega) + \varepsilon(\omega)$ is the noisy measurement and $Y$ is the forecast (predicted) measurement.
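A sketch of this final evaluation step for dim(q) = dim(Y) = 3; the toy coefficients below (an identity-like update) merely stand in for the computed $H_0$, $H_1$, $H_2$:

```python
# Evaluate q_a = H_0 + H_1 d + H_2 d^(ox2) with innovation d = z - Y;
# einsum contracts the quadratic term H_2 against d twice.
import numpy as np

def nonlinear_bu(H0, H1, H2, z, Y):
    d = z - Y                                # innovation
    return H0 + H1 @ d + np.einsum('ijk,j,k->i', H2, d, d)

H0 = np.zeros(3)                             # toy coefficients only
H1 = np.eye(3)
H2 = np.zeros((3, 3, 3))
print(nonlinear_bu(H0, H1, H2, z=np.array([1., 2., 3.]), Y=np.zeros(3)))
```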


Page 17: Non-sampling functional approximation of linear and non-linear Bayesian Update


Numerics

Figure 1: One update in the linearised Lorenz-63.


Page 18: Non-sampling functional approximation of linear and non-linear Bayesian Update


Figure 2: Linear (left) and non-linear (right) Bayesian updates in Lorenz-63.


Page 19: Non-sampling functional approximation of linear and non-linear Bayesian Update


Conclusion about NLBU

• + It is a functional, sampling-free approximation.

• + It offers hope for use in solving inverse problems.

• - The stochastic dimension grows very fast.

• - It is very "expensive" computationally and requires efficient tensor arithmetic.


Page 20: Non-sampling functional approximation of linear and non-linear Bayesian Update


Literature

1. O. Pajonk, B. V. Rosic, A. Litvinenko, and H. G. Matthies, A Deterministic Filter for Non-Gaussian Bayesian Estimation, Physica D: Nonlinear Phenomena, Vol. 241(7), pp. 775-788, 2012.

2. B. V. Rosic, A. Litvinenko, O. Pajonk, and H. G. Matthies, Sampling-Free Linear Bayesian Update of Polynomial Chaos Representations, J. of Comput. Physics, Vol. 231(17), pp. 5761-5787, 2012.
