VALSE

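Visual Information Encoding and Decoding Based on Multi-View Deep Generative Models

Huiguang He (何晖光)

2017-5-24, Institute of Automation, Chinese Academy of Sciences

CAS Center for Excellence in Brain Science and Intelligence Technology

Research background

State of research in China and abroad

Research content and results

Summary and outlook

Research background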

It is important to study brain-inspired intelligence

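Computational neuroscience; brain-machine interfaces

Facebook; Elon Musk's Neuralink

Research background

Studying the visual encoding and decoding mechanisms of the human brain with fMRI

Research background

Visual encoding and decoding based on fMRI signals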

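Encoding, f(R|S): predict the response R from the stimulus S
Stimulus → nonlinear mapping → feature space → linear optimization → BOLD response

Decoding, f(S|R): recover the stimulus S from the observed response R
BOLD response → nonlinear mapping → stimulus (decoding by linear optimization)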

Research background

brain encoding

Research background

brain decoding

Research background

What are brain encoding and decoding?

Encoding model: a model that predicts brain activity from an external stimulus

Decoding model: a model that predicts the external stimulus from brain activity
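The two definitions above describe the same kind of regression run in opposite directions. A minimal sketch, assuming a single made-up stimulus feature and a single made-up voxel response (the data and the helper names `fit_line` and `predict` are illustrative, not from the talk):

```python
def fit_line(inputs, targets):
    """Ordinary least squares for targets ~ slope * inputs + intercept."""
    n = len(inputs)
    mean_in = sum(inputs) / n
    mean_out = sum(targets) / n
    cov = sum((a - mean_in) * (b - mean_out) for a, b in zip(inputs, targets))
    var = sum((a - mean_in) ** 2 for a in inputs)
    slope = cov / var
    return slope, mean_out - slope * mean_in


def predict(model, value):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * value + intercept


# Toy, made-up data: one stimulus feature and one voxel response.
stimulus = [0.0, 1.0, 2.0, 3.0, 4.0]
response = [1.0, 3.0, 5.0, 7.0, 9.0]

encoding_model = fit_line(stimulus, response)  # stimulus -> brain activity
decoding_model = fit_line(response, stimulus)  # brain activity -> stimulus

predicted_response = predict(encoding_model, 5.0)
decoded_stimulus = predict(decoding_model, predicted_response)
```

Real encoding and decoding models work with thousands of voxels and pixels and usually need regularization; the sketch only shows the direction of prediction.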

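State of research

Summary of early work on visual information encoding and decoding

Object classification:

Haxby et al., 2001 (Science)

State of research

Wang CM et al., 2012 (J. Neural Eng.)

Fusiform Face Area

Parahippocampal Place Area

120 images: 92%; 1000 images: 82%

Object identification:

Kay et al., 2008 (Nature)

Visual information reconstruction:

Miyawaki, Y. et al., 2008 (Neuron)

State of research

Visual information reconstruction:

Nishimoto et al., 2011 (Current Biology)

State of research

Gallant CVPR 2015

Semantic reconstruction:

State of research

Research content

Convolutional auto-encoders for visual information encoding and decoding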

Pathway for brain encoding



Obj-Encoding = {reconstruction loss(image), regression loss(BOLD response), weight penalty(L1 or L2)}


Pathway for brain decoding



Obj-Decoding = {reconstruction loss(image), regression loss(feature map), weight penalty(L1 or L2)}

Research content

Reconstructing images from brain signals

Main goal


Linear or nonlinear regression model

Two-stage approach

Approach 1: Convolutional auto-encoder + Regression


Linear or nonlinear model

But how do we preserve the similarity?

Two-stage approach

Need to be similar!


Joint training

Approach 2: Training convolutional auto-encoder and regression simultaneously

Obj = reconstruction loss(image) + regression loss(feature map) + weight penalty(L1 or L2)
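The joint objective above can be sketched as a plain sum of three terms. This is a minimal illustration, assuming mean squared error for both losses and an L2 penalty; the weighting factors `alpha` and `lam` and all numbers are assumptions, not values from the talk:

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def l2_penalty(weights):
    """Sum of squared weights."""
    return sum(w * w for w in weights)


def joint_objective(image, reconstruction, feat_from_image, feat_from_fmri,
                    weights, alpha=1.0, lam=1e-3):
    """Reconstruction loss + regression loss on the feature map + weight penalty."""
    reconstruction_loss = mse(image, reconstruction)
    regression_loss = mse(feat_from_image, feat_from_fmri)
    return reconstruction_loss + alpha * regression_loss + lam * l2_penalty(weights)


# Toy, made-up values.
total = joint_objective(
    image=[0.0, 1.0, 1.0, 0.0],
    reconstruction=[0.0, 1.0, 0.0, 0.0],
    feat_from_image=[1.0, 2.0],
    feat_from_fmri=[1.0, 4.0],
    weights=[1.0, -2.0],
    alpha=0.5,
    lam=0.1,
)
```

Minimizing all three terms together pushes the features regressed from fMRI toward the auto-encoder's features, which is the similarity the two-stage approach could not enforce.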

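Research content

encoding

decoding

Traditional encoding and decoding methods are unidirectional

Research content

encoding

decoding

Would a bidirectional approach work better?

Deep correlation analysis under auto-encoder constraints

Research content

Multi-view representation learning models

Research content

Multi-view generative auto-encoder models

Proposed method: Multi-View Deep Generative Model

Shared Representation

View 1

View 2

Key idea: learn the common latent features, then generate the stimuli and BOLD responses simultaneously

How do we train the model (learn the model parameters)?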


Variational Auto-Encoder (VAE)

The VAE is an interesting generative model that combines ideas from deep learning with statistical inference.

It can be used to learn a low-dimensional representation Z of high-dimensional data X such as images.

In contrast to standard auto-encoders, X and Z are random variables.

Kingma and Welling. “Auto-Encoding Variational Bayes.” International Conference on Learning Representations (ICLR), 2014. arXiv:1312.6114 [stat.ML].


Principal idea: generative network

One example: we wish to learn θ from the N training observations x(i), i = 1, …, N

p(X | Z)


A model for the generative (decoder) network


Training: maximize the likelihood p(x) of the training data

Problem: p(x) cannot be computed, because it requires integrating over all values of z

Solution:
• MCMC (too costly)
• Approximate p(z|x) with q(z|x)
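Replacing p(z|x) with q(z|x) leads to the standard variational lower bound from Kingma and Welling. The decomposition below is the textbook form, with θ the decoder parameters and φ the encoder parameters; the symbol \mathcal{L} for the bound is notation assumed here:

```latex
\log p_\theta(x)
  = \underbrace{\mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
    - \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)}_{\mathcal{L}(\theta,\phi;x)}
  + \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p_\theta(z \mid x)\big)
  \;\ge\; \mathcal{L}(\theta,\phi;x)
```

Because the last KL term is non-negative and involves the intractable posterior, training maximizes \mathcal{L} instead of \log p_\theta(x).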

Training the generative network


A model for the encoder network


How do we build a multi-view VAE?

Learning the parameters φ and θ via backpropagation
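Backpropagation through the sampling step is usually made possible by the reparameterization trick from the VAE paper cited earlier. A minimal sketch, assuming a diagonal Gaussian q(z|x) and a standard normal prior; the function names and the example values are illustrative:

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1).

    Given eps, z is a deterministic function of mu and log_var, so gradients
    can flow from the decoder loss back to the encoder outputs.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


rng = random.Random(0)
mu, log_var = [0.5, -1.0], [0.0, 0.0]   # made-up encoder outputs
z = sample_latent(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
```

In an actual model an automatic-differentiation framework would compute the gradients; the sketch only shows the two quantities that enter the loss.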


Multi-view VAE

Left part: multilayer perceptrons (MLPs) or convolutional neural networks (CNNs)

Right part: DNNs or just a linear model (to avoid overfitting and improve interpretability)
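The two parts can be pictured as two decoders reading the same latent vector. The sketch below assumes the left part is the image view (a small nonlinear network) and the right part is the fMRI view (a linear projection); the layer sizes, random weights, and function names are all hypothetical:

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


def decode_image(z, w_hidden, w_out):
    """Nonlinear view: one tanh hidden layer, sigmoid pixel intensities."""
    hidden = [math.tanh(h) for h in matvec(w_hidden, z)]
    return [1.0 / (1.0 + math.exp(-o)) for o in matvec(w_out, hidden)]


def decode_bold(z, projection):
    """Linear view: each voxel response is a weighted sum of the latent features."""
    return matvec(projection, z)


rng = random.Random(0)
latent_dim, hidden_dim, n_pixels, n_voxels = 3, 8, 100, 5
w_hidden = random_matrix(hidden_dim, latent_dim, rng)
w_out = random_matrix(n_pixels, hidden_dim, rng)
projection = random_matrix(n_voxels, latent_dim, rng)

z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]  # shared representation
image = decode_image(z, w_hidden, w_out)              # view 1: stimulus
bold = decode_bold(z, projection)                     # view 2: BOLD response
```

Keeping the fMRI view linear means each row of `projection` can be read directly as the weights of one voxel on the latent features, which is one way to get the interpretability mentioned above.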

Deep generative multi-view model (DGMM)


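Prior distribution of the latent variable Z

Likelihood of the image view X

Likelihood of the brain-signal view Y

Prior distributions of the auxiliary variables



Objective function, solved by maximum likelihood estimation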
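The equations of this slide did not survive extraction. As a rough guide only, a generic two-view variational bound with a shared latent variable Z has the form below. It assumes X and Y are conditionally independent given Z and an inference network q_\phi(z \mid x); it omits the auxiliary variables listed above and is not the exact objective of the talk:

```latex
\log p(x, y)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z) + \log p(y \mid z)\big]
  - \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
```

The first expectation rewards latent codes that explain both the image and the brain signal; the KL term keeps the approximate posterior close to the prior on Z.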



Variational autoencoder




Brain encoding



Brain decoding

Research content

Experimental data

Data set 1: ‘neuron’

fMRI: 797 brain voxels (V1)

Stimuli: 1400 (10 x 10 pixels)

Subjects: 1


Data set 2: handwritten digits (6 & 9)

fMRI: 3092 brain voxels (V1, V2, V3)

Stimuli: 100 (28 x 28 pixels)

Subjects: 1


Data set 3: handwritten characters (B, R, A, I, N, S)

fMRI: 2536 brain voxels (V1, V2)

Stimuli: 360 (56 x 56 pixels)

Subjects: 3

Research content

Compared methods

Miyawaki et al.: Visual image reconstruction from human brain activity using a combination of multiscale local image decoders. Neuron, 2008.

Bayesian CCA (BCCA) : Modular encoding and decoding models derived from Bayesian canonical correlation analysis. Neural Computation, 2013.

Deep Canonically Correlated Autoencoders (DCCAE) : On deep multi-view representation learning. ICML, 2015.

Deconvolutional Neural Network (De-CNN) : Neural encoding and decoding with deep learning for dynamic natural vision. arXiv:1608.03425v1, 2016.

• Experimental results (HBM 2017 Oral)

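Results

Image reconstruction performance

• Experimental results

Results

Image reconstruction performance

Results

Quantitative evaluation of reconstruction performance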

Experimental results on three fMRI datasets

Results

Extensibility of the framework

Supervised encoding and decoding

Results

Extensibility of the framework

Multi-subject encoding and decoding

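• Proposed a bidirectional modeling framework based on a multi-view generative model

• Strong performance on image reconstruction (information decoding)

• Because fMRI samples for natural-image stimuli are scarce, reconstruction of complex natural scenes is not yet satisfactory

• The current encoding and decoding is static; the next step is dynamic encoding and decoding, for example with a variational RNN

• Approaches to the encoding and decoding problem can borrow the dual-learning idea from machine translation and image translation

• Try other types of deep generative models, such as GANs

• Combining GANs and VAEs is also worth trying

Summary and outlook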

That’s interesting work with significant implications. The ability to reconstruct brain images is an important stepping stone in the work to create better brain-machine interfaces.

—— MIT Technology Review

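References:

Changde Du, Changying Du, Huiguang He*. Sharing deep generative representation for perceived image reconstruction from human brain activity, 2017. https://arxiv.org/abs/1704.07575

About the first author:

Changde Du (杜长德), Ph.D. candidate, Research Center for Brain-Inspired Intelligence, Institute of Automation, Chinese Academy of Sciences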


Thank you!
