AndrewNG-GAN


Course 1 - Build Basic GANs

1.1 Introduction


Generative Models:

  • Variational Autoencoders (VAEs)


  • GANs


GAN in Real Life

  • Creator of GANs: Ian Goodfellow
  • Application areas of GANs:
    • Image Generation, Deepfakes
    • Text Generation
    • Data Augmentation
    • Image Filters

1.2 Basic Components

Discriminator

  • Uses a neural network; input: features (an image), output: a probability that the input is real


  • This probability (e.g. 0.85) is also passed back to the generator as feedback

  • Example input features: the RGB pixel values of an image
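As a concrete illustration (a minimal sketch, not the course's exact code), a discriminator for flattened 28×28 images could look like this in PyTorch; the layer sizes and the LeakyReLU slope are arbitrary assumptions:

```python
import torch
from torch import nn

class Discriminator(nn.Module):
    """Minimal MLP discriminator: flattened image in, real/fake probability out."""
    def __init__(self, im_dim=784, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(im_dim, hidden_dim),
            nn.LeakyReLU(0.2),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid(),   # squash the output to a probability in (0, 1)
        )

    def forward(self, image):
        return self.net(image)

disc = Discriminator()
batch = torch.rand(16, 784)   # 16 flattened 28x28 images as stand-in input
print(disc(batch).shape)      # torch.Size([16, 1]): one probability per image
```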

Generator

  • Uses a neural network; input: a class plus a noise vector, output: features (an image)


  • The generator's goal is to push the prediction $\hat{Y}$ for fake examples as close to 1 as possible, while the discriminator's goal is to push it as close to 0 as possible

  • Once training is good enough, save the generator's parameters θ and feed it random noise to sample more images

  • The generator learns the probability distribution of the features X
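A matching minimal generator sketch, assuming an unconditional GAN for simplicity (the class input is omitted); `z_dim` and the hidden size are arbitrary choices:

```python
import torch
from torch import nn

class Generator(nn.Module):
    """Minimal MLP generator: noise vector in, flattened image out."""
    def __init__(self, z_dim=64, im_dim=784, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, im_dim),
            nn.Sigmoid(),   # pixel values in (0, 1)
        )

    def forward(self, noise):
        return self.net(noise)

gen = Generator()
noise = torch.randn(16, 64)   # sample a batch of random noise vectors
images = gen(noise)           # shape [16, 784]: 16 generated images
```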

BCE Cost Function

  • $J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\hat{y}^{(i)} + \left(1-y^{(i)}\right)\log\left(1-\hat{y}^{(i)}\right)\right]$, where $\hat{y}^{(i)}$ is the prediction for example $i$ and $y^{(i)}$ is its label
  • First half, $y\log\hat{y}$: when the label $y$ is 0, the term is 0; when $y$ is 1, the term is 0 if the prediction is close to 1, and approaches negative infinity if the prediction is close to 0.


  • Second half, $(1-y)\log(1-\hat{y})$: when the label $y$ is 1, the term is 0; when $y$ is 0, the term is 0 if the prediction is close to 0, and approaches negative infinity if the prediction is close to 1.


  • Putting the two halves together: the sum is negated and averaged, so the cost becomes very large whenever the prediction is far from the label.
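This behavior can be checked with PyTorch's built-in `nn.BCELoss`; the prediction values below are made up for illustration:

```python
import torch
from torch import nn

bce = nn.BCELoss()
labels = torch.tensor([1.0, 0.0])

good = torch.tensor([0.99, 0.01])   # predictions close to the labels
bad = torch.tensor([0.01, 0.99])    # predictions far from the labels

print(bce(good, labels))   # small cost (~0.01)
print(bce(bad, labels))    # large cost (~4.6), growing toward infinity
```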

Putting it Together

  • At the start, the discriminator and the generator should be at a similar skill level.
  • The discriminator's task is easier than the generator's.
  • If the discriminator becomes too strong, every fake image the generator produces is classified as 100% fake; "100% fake" is useless feedback for the generator, because it gives no direction for improvement.

Coding

  • PyTorch vs TensorFlow


  • Initialization of the model


  • Cost function


  • Optimizer: stochastic gradient descent (SGD); `lr` is the learning rate


  • Training loop over a number of epochs (a consolidated sketch of these steps follows this list)

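Putting the four coding steps together, a minimal hedged sketch in PyTorch; the tiny stand-in models, the random "real" data, the batch size of 16, and all hyperparameters are placeholder assumptions, not the course's actual notebook code:

```python
import torch
from torch import nn

# Tiny stand-in models (see the fuller sketches in section 1.2).
gen = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 784), nn.Sigmoid())
disc = nn.Sequential(nn.Linear(784, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1), nn.Sigmoid())

criterion = nn.BCELoss()                              # cost function
lr = 0.01
gen_opt = torch.optim.SGD(gen.parameters(), lr=lr)    # stochastic gradient descent
disc_opt = torch.optim.SGD(disc.parameters(), lr=lr)

# Stand-in "data loader": batches of 16 flattened 28x28 images in [0, 1].
real_batches = [torch.rand(16, 784) for _ in range(10)]

for epoch in range(5):                                # training loop over epochs
    for real in real_batches:
        # Train the discriminator: label real images 1, fake images 0.
        disc_opt.zero_grad()
        fake = gen(torch.randn(16, 64)).detach()      # detach: don't update G here
        disc_loss = (criterion(disc(real), torch.ones(16, 1)) +
                     criterion(disc(fake), torch.zeros(16, 1))) / 2
        disc_loss.backward()
        disc_opt.step()

        # Train the generator: try to make the discriminator output 1 on fakes.
        gen_opt.zero_grad()
        gen_loss = criterion(disc(gen(torch.randn(16, 64))), torch.ones(16, 1))
        gen_loss.backward()
        gen_opt.step()
```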

1.3 More Components

Activations

  • Activation functions are non-linear (so networks can approximate complex functions) and differentiable (so backpropagation works); a snippet comparing the four activations below follows this list.

    • ReLU (Rectified Linear Unit): $\max(0, z)$

      • Problem: the dying ReLU problem (units whose input is always negative output 0 and stop learning)
    • Leaky ReLU: $\max(az, z)$, with a small slope $a$ on negative inputs to avoid dying ReLUs

    • Sigmoid: $\frac{1}{1+e^{-z}}$, outputs between 0 and 1

      • often used for the last layer
      • Problem: vanishing gradients where the function saturates
    • Tanh (Hyperbolic Tangent): outputs between -1 and 1
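A minimal snippet comparing the four activations in PyTorch (the input values and the 0.1 leak slope are arbitrary choices):

```python
import torch
from torch import nn

x = torch.linspace(-3, 3, steps=7)

print(nn.ReLU()(x))          # zeroes out negatives (source of the dying ReLU problem)
print(nn.LeakyReLU(0.1)(x))  # keeps a small slope on negatives instead
print(torch.sigmoid(x))      # squashes into (0, 1); flat (saturated) at both tails
print(torch.tanh(x))         # squashes into (-1, 1)
```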

Batch Normalization

Applied to both training data and test data: at test time, the running statistics collected during training are used (see the sketch below).


  • Batch normalization smooths the cost function
  • Batch normalization reduces internal covariate shift
  • Batch normalization speeds up learning
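A minimal sketch of batch normalization in PyTorch during training versus testing; the tensor shapes are arbitrary:

```python
import torch
from torch import nn

bn = nn.BatchNorm2d(num_features=8)   # one learnable scale/shift pair per channel

bn.train()                            # training mode: normalize with batch statistics
x = torch.randn(16, 8, 32, 32)        # (batch, channels, height, width)
y = bn(x)                             # also updates the running mean/variance

bn.eval()                             # test mode: normalize with the running statistics
y_test = bn(torch.randn(1, 8, 32, 32))
```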

Convolution

  • Scans the image to detect useful features
  • Computationally, just element-wise products and sums (see the sketch below)
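A tiny hand-computed example of "element-wise products and sums" (the image and kernel values are made up):

```python
import torch

image = torch.tensor([[1., 2., 3.],
                      [4., 5., 6.],
                      [7., 8., 9.]])
kernel = torch.tensor([[1., 0.],
                       [0., -1.]])

# One output value = element-wise product of a window with the filter, then a sum.
window = image[0:2, 0:2]          # the top-left 2x2 window
print((window * kernel).sum())    # 1*1 + 2*0 + 4*0 + 5*(-1) = -4
```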

Stride & Padding

  • Stride: how far the filter moves at each step as it scans the image

  • Padding: a border (usually of zeros) around the image that gives the edges similar importance to the center (see the sketch below)

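A sketch of how stride and padding change the output size, using `nn.Conv2d`; the input and kernel sizes are arbitrary choices:

```python
import torch
from torch import nn

x = torch.randn(1, 1, 6, 6)  # (batch, channels, height, width)

# Stride 2: the filter jumps 2 pixels at a time, shrinking the output.
print(nn.Conv2d(1, 1, kernel_size=3, stride=2)(x).shape)             # [1, 1, 2, 2]

# Padding 1: a zero border lets the filter cover edge pixels as often
# as central ones, and here preserves the 6x6 size.
print(nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1)(x).shape)  # [1, 1, 6, 6]
```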

Pooling & Upsampling

  • Pooling: reduces the size of the input

  • Upsampling: increases the size of the input

  • Difference from convolution: pooling and upsampling have no learnable parameters, so no learning happens in these layers (see the sketch after this list)

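A minimal sketch contrasting pooling and upsampling in PyTorch (the shapes are arbitrary):

```python
import torch
from torch import nn

x = torch.randn(1, 1, 4, 4)

pooled = nn.MaxPool2d(kernel_size=2)(x)                          # 4x4 -> 2x2
upsampled = nn.Upsample(scale_factor=2, mode='nearest')(pooled)  # 2x2 -> 4x4

# Neither layer has learnable parameters.
print(pooled.shape, upsampled.shape)  # [1, 1, 2, 2] [1, 1, 4, 4]
```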

Transposed Convolution


  • Has learnable parameters, unlike plain upsampling (see the sketch below)
  • Problem: outputs tend to show a checkerboard pattern, because the filter windows overlap unevenly
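For contrast with `nn.Upsample`, a minimal `nn.ConvTranspose2d` sketch (arbitrary shapes) showing that it does carry learnable parameters:

```python
import torch
from torch import nn

x = torch.randn(1, 1, 2, 2)

# Unlike nn.Upsample, a transposed convolution has a learnable weight and bias.
up = nn.ConvTranspose2d(in_channels=1, out_channels=1, kernel_size=3, stride=2)
print(up(x).shape)                              # [1, 1, 5, 5]
print(sum(p.numel() for p in up.parameters()))  # 10 (a 3x3 weight + 1 bias)
```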

1.4 Some Problems of Traditional GANs

Mode Collapse

  • Mode collapse happens when the generator gets stuck in a single mode of the data distribution, producing only one kind of output (e.g. a single digit class).

Problems with BCE loss

  • Flat regions on the cost function mean vanishing gradients: when the discriminator gets far ahead, the generator receives almost no gradient signal to improve.

1.5 Conditional Generation