Eawag: Swiss Federal Institute of Aquatic Science and Technology
Recurrent Neuronal Network tailored forWeather Radar Nowcasting
June 23, 2016
Andreas Scheidegger
Andreas Scheidegger – Eawag
Weather Radar Nowcasting
t0t
-1t-2t
-3t-4
t5t
4t3t
2t1
nowcastmodel
available inputs:every pixel of the previous images
desired outputs:every pixel of the next n images
Andreas Scheidegger – Eawag
Recurrent ANN
Pt
Ht Ht+1
ANN
Ht+2 Ht+3
Pt+3^
ANN ANN
Pt+1^ Pt+2
^
last observedimage
Andreas Scheidegger – Eawag
Recurrent ANN
Pt
Ht Ht+1
ANN
Ht+2 Ht+3 Ht+4
Pt+3^ Pt+4
^
ANN ANN ANN
Pt+1^ Pt+2
^
last observedimage
Andreas Scheidegger – Eawag
Model structure
Pt
Ht Ht+1
ANN
Pt+1^
v j=σ(∑k
w jkuk+b j)
Traditional Artificial Neuronal Networks (ANN) are (nested) non-linear regressions:
Traditional Artificial Neuronal Networks (ANN) are (nested) non-linear regressions:
How can we do better?
Andreas Scheidegger – Eawag
Model structure
Pt
Ht Ht+1
ANN
Pt+1^
v j=σ(∑k
w jkuk+b j)
Traditional Artificial Neuronal Networks (ANN) are (nested) non-linear regressions:
Traditional Artificial Neuronal Networks (ANN) are (nested) non-linear regressions:
How can we do better?
tailor model structure to problem
Andreas Scheidegger – Eawag
Model structure
convolution
Pt
Ht
Pt+1
Ht+1
+
fullyconnected
fullyconnected
fullyconnected
spatialtransformer deconvolution
^
Gaussian blur
fullyconnected
Andreas Scheidegger – Eawag
Spatial transformer
convolution
Pt
Ht
Pt+1
Ht+1
+
fullyconnected
fullyconnected
fullyconnected
spatialtransformer deconvolution
^
Gaussian blur
fullyconnected
Jad
erb
erg
, M
., S
imo
nya
n,
K.,
Zis
serm
an
, A.,
an
d K
avu
kcu
oglu
, K
. (2
01
5)
Sp
atia
l Tra
nsf
orm
er
Ne
two
rks.
arX
iv:1
50
6.0
20
25
[cs
].
Andreas Scheidegger – Eawag
Spatial Transformer
input output
Jaderberg, M., Simonyan, K., Zisserman, A., and Kavukcuoglu, K. (2015) Spatial Transformer Networks. arXiv:1506.02025 [cs].
Andreas Scheidegger – Eawag
Spatial transformer
convolution
Pt
Ht
Pt+1
Ht+1
+
fullyconnected
fullyconnected
fullyconnected
spatialtransformer deconvolution
^
Gaussian blur
fullyconnected
Andreas Scheidegger – Eawag
Local correction
convolution
Pt
Ht
Pt+1
Ht+1
+
fullyconnected
fullyconnected
fullyconnected
spatialtransformer deconvolution
^
Gaussian blur
fullyconnected
Andreas Scheidegger – Eawag
Gaussian blur
convolution
Pt
Ht
Pt+1
Ht+1
+
fullyconnected
fullyconnected
fullyconnected
spatialtransformer deconvolution
^
Gaussian blur
fullyconnected
Andreas Scheidegger – Eawag
Recurrent ANN
Pt
Ht Ht+1
ANN
Ht+2 Ht+3 Ht+4
Pt+3^ Pt+4
^
ANN ANN ANN
Pt+1^ Pt+2
^
last observedimage
Andreas Scheidegger – Eawag
Model structure
convolution
Pt
Ht
Pt+1
Ht+1
+
fullyconnected
fullyconnected
fullyconnected
spatialtransformer deconvolution
^
Gaussian blur
fullyconnected
Andreas Scheidegger – Eawag
Deep learning may be a hype – but it's a useful tool nevertheless!
Conclusions
Andreas Scheidegger – Eawag
v j=σ(∑
k
w jku k+b j)
Combine domain knowledge with data driven modeling
Deep learning may be a hype – but it's a useful tool nevertheless!
Conclusions
Andreas Scheidegger – Eawag
and Outlook
v j=σ(∑
k
w jku k+b j)
Combine domain knowledge with data driven modeling
Deep learning may be a hype – but it's a useful tool nevertheless!
Conclusions
Andreas Scheidegger – Eawag
and Outlook
v j=σ(∑
k
w jku k+b j)
Combine domain knowledge with data driven modeling
θt+1=θt−λ ∇ L(θt)Online parameter adaptionDeep learning may be a
hype – but it's a useful tool nevertheless!
Conclusions
Andreas Scheidegger – Eawag
and Outlook
v j=σ(∑
k
w jku k+b j)
Combine domain knowledge with data driven modeling
θt+1=θt−λ ∇ L(θt)Online parameter adaption
L(θ)Other objective function?
Deep learning may be a hype – but it's a useful tool nevertheless!
Conclusions
Andreas Scheidegger – Eawag
and Outlook
v j=σ(∑
k
w jku k+b j)
Combine domain knowledge with data driven modeling
θt+1=θt−λ ∇ L(θt)Online parameter adaption
L(θ)Other objective function?
Deep learning may be a hype – but it's a useful tool nevertheless!
Include other inputs
Conclusions
Andreas Scheidegger – Eawag
and Outlook
v j=σ(∑
k
w jku k+b j)
Combine domain knowledge with data driven modeling
θt+1=θt−λ ∇ L(θt)Online parameter adaption
L(θ)Other objective function?
Deep learning may be a hype – but it's a useful tool nevertheless!
Include other inputs
Predict prediction uncertainty
Conclusions
Andreas Scheidegger – Eawag
and Outlook
v j=σ(∑
k
w jku k+b j)
Combine domain knowledge with data driven modeling
θt+1=θt−λ ∇ L(θt)Online parameter adaption
L(θ)Other objective function?
Deep learning may be a hype – but it's a useful tool nevertheless!
Include other inputs
Interested in collaboration? [email protected]
Predict prediction uncertainty
Conclusions
Andreas Scheidegger – Eawag
Training and Implementaion
θnew=θold−λ ∇ L(θold)
Loss function
Scheduled Learning
Regularisation
Drop-out
L(θ)=∫0
∞
RMS (τ ;θ) p(τ)d τ Runs on cuda compatible GPUs
Based on Python library chainer
Srivastava, et al. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15, 1929–1958.
Stochastic Gradient Decent
Implementation
Bengio, et al. (2015). Scheduled sampling for sequence prediction with recurrent neural networks. arXiv preprint arXiv:1506.03099.
Tokui, et al., (2015). Chainer: a Next-Generation Open Source Framework for Deep Learning.
Andreas Scheidegger – Eawag
Training
Pt
Ht Ht+1
Pt+1
ANN
Ht+2 Ht+3 Ht+4
Pt+3^ Pt+4
^
ANN ANN ANN
Pt+2
training predictions
observed images predicted images
Andreas Scheidegger – Eawag
Scheduled Training
Pt
Ht Ht+1
Pt+1ANN
Ht+2 Ht+3 Ht+4
Pt+3^ Pt+4
^
ANN ANN ANN
Pt+1^
or
Pt+2
Pt+2^
or
training predictions
observed or predicted images predicted images
Bengio, Set al. (2015). Scheduled sampling for sequence prediction with recurrent neural networks. arXiv preprint arXiv:1506.03099.
Andreas Scheidegger – Eawag
Traditional model structures
Estimate velocity field and translate last radar image (optical flow)
Identify rain cells, track and extrapolate them
→ seems to be the most “natural” approach
→ can model grows and decay of cells
● Models can be difficult to tune● Local features (e.g. mountain range) cannot be modeled
Convolution layers
https:// developer.apple.com/library/ios/documentation/Performance/Conceptual/vImage/ConvolutionOperations/ConvolutionOperations.html
Convolution layers
https:// developer.apple.com/library/ios/documentation/Performance/Conceptual/vImage/ConvolutionOperations/ConvolutionOperations.html https://en.wikipedia.org/wiki/Kernel_(im
age_processing)
Top Related