Introduction to Generative Adversarial Network (GAN)
Hongsheng Li
Department of Electronic Engineering
Chinese University of Hong Kong
Adversarial – adj. opposing; antagonistic
Generative Models
• Density Estimation
  – Discriminative model: p(y | x)
    • y = 0 for elephant, y = 1 for horse
  – Generative model: p(x | y)
    • p(x | y = 0) for elephant, p(x | y = 1) for horse
Generative Models
• Sample Generation
  (Figure: training samples vs. model samples)
Generative Models
• GAN is a generative model
  – Mainly focuses on sample generation
  – Possible to do both (density estimation and sample generation)
  (Figure: data ~ p_data → model p_model → sample generation)
Why Worth Studying?
• Excellent test of our ability to use high-dimensional, complicated probability distributions
• Missing data
– Semi-supervised learning
Why Worth Studying?
• Multi-modal outputs
– Example: next frame prediction
Lotter et al. 2015
Why Worth Studying?
• Image generation tasks
– Example: single-image super-resolution
Ledig et al. 2016
Why Worth Studying?
• Image generation tasks
– Example: Image-to-Image Translation
– https://affinelayer.com/pixsrv/
Isola et al. 2016
Why Worth Studying?
• Image generation tasks
– Example: Text-to-Image Generation
Zhang et al. 2016
How does GAN Work?
• Adversarial – adj. opposing; antagonistic
• Two networks:
– Generator G: creates (fake) samples that the discriminator cannot distinguish from real ones
– Discriminator D: determines whether samples are real or fake
(Figure: the generator and the discriminator compete)
The Generator
• G: a differentiable function
– modeled as a neural network
• Input:
– z: random noise vector from some simple prior distribution
• Output:
– x = G(z): generated samples
(Figure: z → Generator → x)
The Generator
• The dimension of z should be at least as large as that of x
(Figure: z → Generator → G(z) = x, with the model distribution p_model approximating x ~ p_data)
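As a toy sketch of the generator as a differentiable function, the snippet below builds a one-hidden-layer network G that maps a noise vector z to a sample x. The names (`make_generator`), sizes, and random weights are illustrative assumptions, not from the slides; a real generator is a deep network whose weights are learned by backpropagation.

```python
import math
import random

def make_generator(z_dim, x_dim, seed=0):
    """Build a toy one-hidden-layer generator G: z -> x.
    Weights are random here; in a real GAN they are learned."""
    rng = random.Random(seed)
    hidden = 8
    W1 = [[rng.gauss(0, 0.5) for _ in range(z_dim)] for _ in range(hidden)]
    W2 = [[rng.gauss(0, 0.5) for _ in range(hidden)] for _ in range(x_dim)]

    def G(z):
        # differentiable map: linear -> tanh -> linear
        h = [math.tanh(sum(w * zi for w, zi in zip(row, z))) for row in W1]
        return [sum(w * hi for w, hi in zip(row, h)) for row in W2]
    return G

# Sample z from a simple prior (standard normal) and generate x = G(z)
rng = random.Random(42)
G = make_generator(z_dim=4, x_dim=2)
z = [rng.gauss(0, 1) for _ in range(4)]
x = G(z)
print(len(x))  # 2
```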
The Discriminator
• D: modeled as a neural network
• Input:
– Real sample
– Generated sample x
• Output:
– 1 for real samples
– 0 for fake samples
(Figure: generated sample x → Discriminator → 0; real data → Discriminator → 1)
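A matching toy discriminator can be sketched as a logistic model mapping a sample x to the probability that it is real. The weights below are random placeholders (hypothetical, not from the slides); in practice D is a deep network trained jointly with G.

```python
import math
import random

def make_discriminator(x_dim, seed=1):
    """Toy discriminator D: x -> probability that x is real."""
    rng = random.Random(seed)
    w = [rng.gauss(0, 0.5) for _ in range(x_dim)]
    b = 0.0

    def D(x):
        logit = sum(wi * xi for wi, xi in zip(w, x)) + b
        return 1.0 / (1.0 + math.exp(-logit))  # sigmoid -> (0, 1)
    return D

D = make_discriminator(x_dim=2)
p_real = D([1.0, -0.5])
print(0.0 < p_real < 1.0)  # True
```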
Generative Adversarial Networks
Cost Functions
• The discriminator outputs a value D(x) indicating the chance that x is a real image
• For real images, the ground-truth labels are 1; for generated images, the labels are 0
• Our objective is to maximize the chance of recognizing real images as real and generated images as fake
• The objective for the discriminator can be defined as
  max_D E_{x~p_data}[log D(x)] + E_{z~p_z}[log(1 − D(G(z)))]
Cost Functions
• For the generator G, the objective is to generate images with the highest possible value of D(x), in order to fool the discriminator
• The cost function is
  min_G E_{z~p_z}[log(1 − D(G(z)))]
• The overall GAN training is therefore a min-max game:
  min_G max_D E_{x~p_data}[log D(x)] + E_{z~p_z}[log(1 − D(G(z)))]
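The objectives above can be evaluated numerically. The sketch below (with hypothetical helper names) computes the discriminator's loss and the generator's min-max cost for a batch of assumed discriminator outputs:

```python
import math

def d_loss(d_real, d_fake):
    """Discriminator loss: negative of the objective
    E[log D(x)] + E[log(1 - D(G(z)))], averaged over a batch."""
    n = len(d_real)
    return -(sum(math.log(p) for p in d_real) +
             sum(math.log(1 - p) for p in d_fake)) / n

def g_loss_minimax(d_fake):
    """Generator's min-max cost: E[log(1 - D(G(z)))]."""
    return sum(math.log(1 - p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs on a batch of real and fake samples
d_real = [0.9, 0.8, 0.95]   # close to 1: D is confident these are real
d_fake = [0.1, 0.2, 0.05]   # close to 0: D is confident these are fake
print(d_loss(d_real, d_fake))   # small: D is doing well
print(g_loss_minimax(d_fake))   # close to 0: little signal for G
```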
Training Procedure
• The generator and the discriminator are learned jointly by alternating gradient descent
  – Fix the generator's parameters and perform a single iteration of gradient descent on the discriminator, using both the real and the generated images
  – Fix the discriminator and train the generator for another single iteration
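The alternating procedure can be sketched as a loop skeleton. `d_step` and `g_step` stand in for the actual gradient updates and are purely illustrative:

```python
import random

def train_gan(d_step, g_step, sample_real, sample_noise,
              n_iters=3, batch_size=4):
    """Alternating gradient descent skeleton: each iteration performs one
    discriminator update (G fixed), then one generator update (D fixed).
    d_step/g_step are caller-supplied update functions (hypothetical)."""
    history = []
    for it in range(n_iters):
        reals = [sample_real() for _ in range(batch_size)]
        noise = [sample_noise() for _ in range(batch_size)]
        d_info = d_step(reals, noise)   # update D with G fixed
        g_info = g_step(noise)          # update G with D fixed
        history.append((it, d_info, g_info))
    return history

# Stub steps just to show the control flow
rng = random.Random(0)
log = train_gan(d_step=lambda r, z: "d-updated",
                g_step=lambda z: "g-updated",
                sample_real=lambda: rng.gauss(3, 1),
                sample_noise=lambda: rng.gauss(0, 1))
print(len(log))  # 3
```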
The Algorithm
Illustration of the Learning
• Generative adversarial learning aims to learn a model distribution that matches the actual data distribution
(Figure: discriminator, data distribution, and model distribution gradually converging)
Generator diminished gradient
• However, we encounter a diminishing-gradient problem for the generator: the discriminator usually wins early against the generator
• It is always easier to distinguish generated images from real images early in training, which makes the generator's cost approach 0, i.e. log(1 − D(G(z))) → 0
• The gradient for the generator then also vanishes, which makes gradient-descent optimization very slow
• To improve this, GAN training uses an alternative cost function to backpropagate gradients to the generator
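The vanishing-gradient argument can be checked numerically. Writing a = D(G(z)), the min-max cost log(1 − a) has derivative −1/(1 − a), while the alternative cost −log(a) has derivative −1/a; when the discriminator wins early (a near 0), only the latter gives a strong signal. A minimal check:

```python
def grad_minimax(a):
    """d/da of log(1 - a): the generator's min-max cost term."""
    return -1.0 / (1.0 - a)

def grad_non_saturating(a):
    """d/da of -log(a): the alternative (non-saturating) cost."""
    return -1.0 / a

# Early in training the discriminator wins, so a = D(G(z)) is near 0
a = 0.01
print(abs(grad_minimax(a)))         # ~1.01: tiny learning signal
print(abs(grad_non_saturating(a)))  # ~100: strong learning signal
```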
Comparison between Two Losses
Non-Saturating Game
• Discriminator cost (unchanged): J(D) = −½ E_{x~p_data}[log D(x)] − ½ E_{z}[log(1 − D(G(z)))]
• Generator cost: J(G) = −½ E_{z}[log D(G(z))]
• In the min-max game, the generator maximizes the same cross-entropy that the discriminator minimizes
• Now, the generator instead maximizes the log-probability of the discriminator being mistaken
• Heuristically motivated: the generator can still learn even when the discriminator successfully rejects all generator samples
Deep Convolutional Generative Adversarial Networks (DCGAN)
• All convolutional nets
• No global average pooling
• Batch normalization
• ReLU
Radford et al. 2016
Deep Convolutional Generative Adversarial Networks (DCGAN)
Radford et al. 2016
• LSUN bedrooms (about 3 million training images)
Manipulating Learned z
Image Super-resolution with GAN
Ledig et al. 2016
Image Super-resolution with GAN
(Figure: comparison of bicubic, SRResNet, SRGAN, and the original image)
Context-Encoder for Image Inpainting
• For a pre-defined region, synthesize the image contents
Pathak et al. 2016
Context-Encoder for Image Inpainting
• Overall framework
(Figure: original region vs. synthetic region)
Context-Encoder for Image Inpainting
• The objective combines a reconstruction loss and an adversarial loss:
  L = λ_rec L_rec + λ_adv L_adv
  where L_rec is an L2 loss between the synthesized region and the ground truth, and L_adv is the standard adversarial loss
Image Inpainting with Partial Convolution
Liu et al. 2018
• Partial convolution for handling missing data
• L1 loss: minimizes the pixel-wise differences between the generated images and their ground-truth images
• Perceptual loss: minimizes the differences between the VGG features of the generated images and their ground-truth images
• Style loss (Gram matrix): minimizes the differences between the Gram matrices of the generated images and their ground-truth images
Image Inpainting with Partial Convolution: Results
Liu et al. 2018
Texture Synthesis with Patch-based GAN
Liu et al. 2018
• Synthesize textures for input images
Texture Synthesis with Patch-based GAN
Li and Wand 2016
(Figure: results with MSE loss vs. adversarial loss)
Conditional GAN
• GAN is too free. How can we add constraints?
• Add conditional variables y into the generator
Mirza and Osindero 2014
(Figure: training samples vs. model samples)
Conditional GAN
• GAN is too free. How can we add constraints?
• Add conditional variables y into G and D
Mirza and Osindero 2014
Conditional GAN
Mirza and Osindero 2014
(Figure: conditioning on a one-hot label vector, e.g. 0 1 0 0 0 0 0 0 0 0)
Conditional GAN
Mirza and Osindero 2014
• Positive samples for D
– True data + corresponding conditioning variable
• Negative samples for D
– Synthetic data + corresponding conditioning variable
– True data + non-corresponding conditioning variable
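Conditioning is often implemented by concatenating a one-hot encoding of y to the generator's noise input (and pairing it with x for the discriminator). A minimal sketch with hypothetical helper names:

```python
import random

def one_hot(label, n_classes):
    """Encode a class label as a one-hot conditioning vector y."""
    return [1.0 if i == label else 0.0 for i in range(n_classes)]

def conditioned_input(z, label, n_classes=10):
    """Conditional GAN feeds [z ; y] to G (and pairs samples with y for D)."""
    return z + one_hot(label, n_classes)

rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(4)]
g_in = conditioned_input(z, label=1)
print(len(g_in))  # 14 (4 noise dims + 10 label dims)
print(g_in[4:])   # one-hot vector for class 1
```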
Text-to-Image Synthesis
Reed et al. 2016
StackGAN: Text to Photo-realistic Images
Zhang et al. 2016
• How do humans draw a figure?
  – In a coarse-to-fine manner
StackGAN: Text to Photo-realistic Images
Zhang et al. 2016
• Use stacked GAN structure for text-to-image synthesis
StackGAN: Text to Photo-realistic Images
• Conditioning augmentation
• No random noise vector z for Stage-2
• Conditioning both stages on text helps achieve better results
• Spatial replication for the text conditional variable
• Negative samples for D
  – True images + non-corresponding texts
  – Synthetic images + corresponding texts
Conditioning Augmentation
• How to train parameters such as the mean and variance of a Gaussian distribution N(μ0, Σ0)?
• Sample ε from the standard normal distribution N(0, I)
• Multiply ε by σ0 and then add μ0: c = μ0 + σ0 ⊙ ε
• This is the re-parameterization trick
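The re-parameterization trick above can be sketched in a few lines; the names here are illustrative. Randomness is pushed into ε, so gradients can flow back to μ and σ during training:

```python
import random

def sample_conditioning(mu, sigma, rng):
    """Re-parameterization trick: c = mu + sigma * eps, eps ~ N(0, 1).
    mu and sigma stay differentiable; only eps is random."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]

rng = random.Random(0)
mu = [0.5, -1.0, 2.0]
sigma = [0.1, 0.1, 0.1]
c = sample_conditioning(mu, sigma, rng)
print(len(c))  # 3
```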
More StackGAN Results on Flower
More StackGAN Results on COCO
StackGAN-v2: Architecture
• Approximate multi-scale image distributions jointly
• Approximate conditional and unconditional image distributions jointly
StackGAN-v2: Results
Progressive Growing of GAN
• Shares a similar spirit with StackGAN-v1/-v2 but uses a different training strategy
Progressive Growing of GAN
• Impressively realistic face images
Image-to-Image Translation with Conditional GAN
Isola et al. 2016
Image-to-Image Translation with Conditional GAN
• Incorporate L1 loss into the objective function
• Adopt the U-net structure for the generator
(Figure: encoder-decoder vs. encoder-decoder with skip connections (U-net))
Patch-based Discriminator
• Separate each image into N x N patches
• Instead of distinguishing whether the whole image is real or fake, train a patch-based discriminator that classifies each patch
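A minimal sketch of the patch idea: split the image into tiles, score each tile, and average. `score_patch` stands in for a small CNN and is purely illustrative:

```python
def split_into_patches(img, patch):
    """Split a 2-D image (list of rows) into non-overlapping patch x patch tiles."""
    h, w = len(img), len(img[0])
    patches = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            patches.append([row[j:j + patch] for row in img[i:i + patch]])
    return patches

def patch_discriminator(img, patch, score_patch):
    """Score each patch as real/fake and average, instead of scoring
    the whole image once. score_patch is a stand-in for a small CNN."""
    scores = [score_patch(p) for p in split_into_patches(img, patch)]
    return sum(scores) / len(scores)

img = [[float(i + j) for j in range(4)] for i in range(4)]  # 4x4 toy image
mean_brightness = lambda p: sum(sum(r) for r in p) / (len(p) * len(p[0]))
print(patch_discriminator(img, 2, mean_brightness))  # 3.0
```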
More Results
CycleGAN
• All previous methods require paired training data, i.e., exact input-output pairs, which can be extremely difficult to obtain in practice
CycleGAN
• The framework learns two mapping functions (generators) G : X → Y and F : Y → X with two domain discriminators DX and DY
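Besides the two adversarial losses, CycleGAN constrains the mappings with a cycle-consistency loss: translating to the other domain and back should recover the input. A sketch with toy 1-D "images" and illustrative names:

```python
def l1(a, b):
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(G, F, xs, ys):
    """L_cyc = E[|F(G(x)) - x|] + E[|G(F(y)) - y|]."""
    loss_x = sum(l1(F(G(x)), x) for x in xs) / len(xs)
    loss_y = sum(l1(G(F(y)), y) for y in ys) / len(ys)
    return loss_x + loss_y

# Toy "domains": G shifts by +1, F shifts by -1, so the cycle is exact
G = lambda v: [e + 1.0 for e in v]
F = lambda v: [e - 1.0 for e in v]
xs = [[0.0, 1.0], [2.0, 3.0]]
ys = [[5.0, 6.0]]
print(cycle_consistency_loss(G, F, xs, ys))  # 0.0
```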
CycleGAN: Results
S2-GAN: Decomposing difficult problems into subproblems
• Generating indoor images
• Generating surface normal map + surface style map
Style-GAN
S2-GAN Results
Insights
• Some insights
  – Decomposing a hard problem into easier subproblems helps
  – Spatially well-aligned conditioning variables are generally better
Non-convergence in GANs
• Finding equilibrium is a game of two players
• Exploiting convexity in function space, GAN training is theoretically guaranteed to converge if we could modify the density functions directly, but:
  – Instead, we modify G (the sample generation function) and D (the density ratio), not the densities themselves
  – We represent G and D as highly non-convex parametric functions
• "Oscillation": training can run for a very long time, generating many different categories of samples, without clearly producing better samples
• Mode collapse: the most severe form of non-convergence
Mode Collapse
• D in inner loop: convergence to correct distribution
• G in inner loop: place all mass on most likely point
Metz et al 2016
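The two orderings correspond to different optimization problems. Writing the GAN value function as $V(G, D)$, simultaneous gradient descent does not clearly prioritize either side of:

```latex
\min_G \max_D V(G, D) \;\neq\; \max_D \min_G V(G, D)
```

With $G$ in the inner loop (the maximin problem), the best response of $G$ to a fixed $D$ is to map every $z$ to the single point $x^{*} = \arg\max_x D(x)$ — which is exactly mode collapse.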
Mode Collapse Causes Low Output Diversity
Conditioning Augmentation
Minibatch Features
• Add minibatch features that classify each example by comparing it to the other members of the minibatch (Salimans et al 2016)
• Nearest-neighbor-style features detect whether a minibatch contains samples that are too similar to each other
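The full minibatch-discrimination layer of Salimans et al 2016 learns a projection tensor; the simplified numpy sketch below (the function name and the raw-distance statistic are illustrative choices, not the paper's exact formulation) shows the core idea of giving D a per-example similarity feature:

```python
import numpy as np

def minibatch_similarity_features(h):
    """Append a per-example similarity statistic to discriminator features.

    h: (batch, d) array of intermediate discriminator features.
    Returns a (batch, d + 1) array: each example gains one feature that is
    large when other minibatch members are near-duplicates of it, so the
    discriminator can penalize a collapsed generator.
    """
    # Pairwise L1 distances between all feature vectors in the minibatch.
    dists = np.abs(h[:, None, :] - h[None, :, :]).sum(axis=-1)  # (batch, batch)
    np.fill_diagonal(dists, np.inf)  # ignore each example's distance to itself
    closeness = np.exp(-dists).sum(axis=1)  # near-duplicates push this up
    return np.concatenate([h, closeness[:, None]], axis=1)
```

A collapsed minibatch of identical samples yields the maximal closeness value (batch size minus one), while a diverse batch yields values near zero, giving D an easy signal to reject collapsed batches.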
Minibatch GAN on CIFAR
Minibatch GAN on ImageNet
Cherry-picked Results
Problems with Counting
Problems with Perspective
Unrolled GANs
• A toy example: on a 2D mixture of Gaussians, a standard GAN visits one mode at a time, while unrolling the discriminator's updates inside the generator's objective recovers all the modes
Metz et al 2016
Tips and Tricks for training GAN
• Normalize the inputs
– Normalize the images to between -1 and 1
– Use tanh as the last layer of the generator output
• Modified loss function
– Because of vanishing gradients (Goodfellow et al 2014), use the non-saturating loss: train G to maximize log D(G(z)) rather than minimize log(1 - D(G(z)))
– In practice: flip the labels when training the generator (real = fake, fake = real)
Soumith et al 2016
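The modified loss is the non-saturating generator cost of Goodfellow et al 2014. The original minimax cost gives vanishing gradients when D confidently rejects samples early in training, so the generator instead maximizes $\log D(G(z))$:

```latex
J^{(G)}_{\text{minimax}} = \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
\quad\longrightarrow\quad
J^{(G)}_{\text{non-saturating}} = -\,\mathbb{E}_{z \sim p_z}\big[\log D(G(z))\big]
```

Both are minimized by the generator; the second provides strong gradients precisely when D easily rejects the fake samples.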
Tips and Tricks for training GAN
• Use a spherical Z
– Don't sample from a uniform distribution
– Sample from a Gaussian distribution
– When interpolating, interpolate via a great circle rather than a straight line (White et al 2016)
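Great-circle interpolation is usually implemented as spherical linear interpolation (slerp). A minimal numpy sketch (the fallback tolerance for nearly parallel vectors is an arbitrary choice):

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical interpolation between two latent vectors (White et al 2016).

    Interpolates along the great circle through z0 and z1 rather than the
    straight line between them, staying in the region where a Gaussian
    prior actually puts its mass.
    """
    # Angle between the two latent directions.
    omega = np.arccos(np.clip(
        np.dot(z0 / np.linalg.norm(z0), z1 / np.linalg.norm(z1)), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return (1.0 - t) * z0 + t * z1  # nearly parallel: fall back to lerp
    return (np.sin((1.0 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)
```

Unlike straight-line interpolation, slerp does not pass through the low-norm region near the origin, where a high-dimensional Gaussian places almost no probability mass.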
Tips and Tricks for training GAN
• Batch normalization
– Compute the mean and standard deviation of the features
– Normalize the features (subtract the mean, divide by the standard deviation)
Soumith et al 2016
Tips and Tricks for training GAN
• Batch normalization in G
Soumith et al 2016
Tips and Tricks for training GAN
• Reference Batch Normalization
– Fix a reference batch R
– Given new inputs X
– Normalize the features of X using the mean and standard deviation from R
– Every example is always treated the same, regardless of which other examples appear in the minibatch
Salimans et al 2016
Tips and Tricks for training GAN
• Virtual Batch Normalization
– Reference batch norm can overfit to the reference batch; a partial solution is virtual batch norm
– Fix a reference batch R
– Given new inputs X
– For each example x in X:
• Construct a minibatch V containing x and all of R
• Compute the mean and standard deviation of V
• Normalize the features of x using that mean and standard deviation
Salimans et al 2016
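The steps above can be sketched in numpy (feature extraction is omitted; in the real method the statistics are computed on intermediate network activations, and a learned scale and shift are applied afterwards):

```python
import numpy as np

def virtual_batch_norm(x, R, eps=1e-5):
    """Normalize one new example x with statistics of a "virtual" batch
    consisting of x together with the fixed reference batch R.

    x: (d,) feature vector; R: (m, d) reference-batch features.
    Unlike plain reference batch norm, x itself contributes to the
    statistics, which reduces overfitting to the reference batch.
    """
    V = np.vstack([x[None, :], R])   # the virtual batch: x plus all of R
    mu = V.mean(axis=0)
    sigma = V.std(axis=0)
    return (x - mu) / (sigma + eps)  # eps guards against zero variance
```

Because each example is normalized against x-plus-R only, the output never depends on which other new inputs happen to share its minibatch.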
Tips and Tricks for training GAN
• Use the Adam optimizer
– Use SGD for the discriminator and Adam for the generator
• Avoid sparse gradients: ReLU, MaxPool
– The stability of the GAN game suffers if you have sparse gradients
– LeakyReLU works well (in both G and D)
– For downsampling, use average pooling or Conv2d with stride
– For upsampling, use bilinear interpolation or PixelShuffle
Tips and Tricks for training GAN
• Use Soft and Noisy Labels
– Default cost: cross-entropy with hard labels (real = 1, fake = 0)
– Label smoothing (Salimans et al. 2016) is one-sided:
– For real examples (label = 1), replace the target with 0.9; for fake examples (label = 0), keep it at 0
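A minimal numpy sketch of the smoothed discriminator cost (the function name is illustrative; a framework implementation would work on logits for numerical stability):

```python
import numpy as np

def d_loss_one_sided_smoothing(d_real, d_fake, real_target=0.9):
    """Discriminator cross-entropy loss with one-sided label smoothing:
    targets for real examples become 0.9, fake targets stay exactly 0.

    d_real, d_fake: discriminator outputs in (0, 1) on real / fake batches.
    """
    eps = 1e-12  # numerical guard for log
    loss_real = -np.mean(real_target * np.log(d_real + eps)
                         + (1.0 - real_target) * np.log(1.0 - d_real + eps))
    loss_fake = -np.mean(np.log(1.0 - d_fake + eps))
    return loss_real + loss_fake
```

With these targets the optimal D output on real data is 0.9 rather than 1.0, which keeps D from producing extreme, overconfident outputs; the fake side is deliberately left unsmoothed, since smoothing it would reinforce the generator's current samples.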
Tips and Tricks for training GAN
• Use Soft and Noisy Labels
– Make the labels noisy for the discriminator: occasionally flip the labels when training the discriminator
Tips and Tricks for training GAN
• Track failures early
– D loss going to 0 is a failure mode
– Check the norms of the gradients: if they exceed 100, things are going wrong
– When training is going well, D loss has low variance and decreases steadily; otherwise it is spiky
– If the loss of G steadily decreases, it is likely fooling D with garbage
Tips and Tricks for training GAN
• Discrete variables in Conditional GANs
– Use an embedding layer
– Add the embedding as additional channels to the images
– Keep the embedding dimensionality low and upsample to match the image channel size
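One way to realize these bullets, sketched in numpy (the names are illustrative; in a real model the embedding table would be a trainable Embedding layer):

```python
import numpy as np

def label_to_image_channels(label, embedding_table, height, width):
    """Hypothetical sketch: turn a discrete class label into extra image
    channels for a conditional GAN input.

    embedding_table: (num_classes, embed_dim) array standing in for a
    learned Embedding layer; the returned (embed_dim, height, width)
    tensor can be concatenated with an image along the channel axis.
    """
    e = embedding_table[label]  # (embed_dim,) embedding vector
    # Broadcast ("upsample") the low-dimensional embedding to image size.
    return np.broadcast_to(e[:, None, None], (e.shape[0], height, width)).copy()
```

Keeping embed_dim small means only a few extra channels are concatenated, so the conditioning signal does not dominate the image content.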
Tips and Tricks for training GAN
• Use label information when possible
– Use labels as a conditioning variable (conditional GAN)
– Or add an auxiliary classifier (AC-GAN)
Conditional GAN vs. AC-GAN
Tips and Tricks for training GAN
• Balancing G and D
– Usually the discriminator “wins”
– That is fine: the theoretical justifications assume D is perfect
– Usually D is bigger and deeper than G
– Sometimes running D more often than G helps, but results are mixed
– Do not try to limit D to avoid making it “too smart”
• Use the non-saturating cost
• Use label smoothing
Research Directions
• Better network structures
• Better objective functions
• Novel problem setups
• Use of adversarial losses in other CV/ML applications
• Theories on GANs