AndrewNG-DL Applications

1 Setting up an ML application

Training set: used to fit the model.
Validation/Development (dev) set: used to select among candidate models.
Test set: used to estimate the generalization error of the final chosen model.

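In the earlier era of limited data, we used a 60/20/20 split for train/dev/test.
In the big-data era, we can use about 99% of the data as the training set, because the remaining 1% is often still enough examples for the dev and test sets.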
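
The two splitting regimes above can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the helper name `split_dataset`, its parameters, and the fixed seed are assumptions made for this example, and the 0.5%/0.5% division of the leftover 1% is just one possible choice.

```python
import random

def split_dataset(examples, train_frac, dev_frac, seed=0):
    """Shuffle examples and split them into train/dev/test lists.

    The test set receives whatever remains after train and dev.
    """
    if not (0 < train_frac < 1 and 0 <= dev_frac < 1
            and train_frac + dev_frac <= 1):
        raise ValueError("fractions must be valid and sum to at most 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_train = round(len(shuffled) * train_frac)
    n_dev = round(len(shuffled) * dev_frac)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test

# Limited data: 60/20/20
train, dev, test = split_dataset(range(1000), 0.6, 0.2)
print(len(train), len(dev), len(test))  # 600 200 200

# Big data: 99% for training, the remaining 1% shared by dev and test
train, dev, test = split_dataset(range(1_000_000), 0.99, 0.005)
print(len(train), len(dev), len(test))  # 990000 5000 5000
```

Shuffling once and then slicing draws the dev and test sets from the same pool of examples.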

Make sure the dev and test sets come from the same distribution.

Bias is judged relative to the base (optimal) error. If the base error is itself high, 15% for example, then a training error close to that level does not indicate high bias.
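
One way to make this comparison concrete is to look at the gap between training error and base error rather than at the training error alone. The function names and the `tolerance` threshold below are hypothetical, chosen only for illustration; the notes do not specify a cutoff.

```python
def bias_gap(train_error, base_error):
    """Gap between the training error and the base (best achievable) error."""
    return train_error - base_error

def is_high_bias(train_error, base_error, tolerance=0.02):
    # `tolerance` is an arbitrary illustrative threshold, not a standard value.
    return bias_gap(train_error, base_error) > tolerance

# Same 15% training error, two different base errors (made-up numbers):
print(is_high_bias(train_error=0.15, base_error=0.00))  # True
print(is_high_bias(train_error=0.15, base_error=0.15))  # False
```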

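Regularization helps prevent overfitting, which reduces a neural network's error on data it was not trained on. For a model with a single weight vector $w$, such as logistic regression, the L2-regularized cost is: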

$$ J(w,b)=\frac{1}{m}\sum^{m}_{i=1}L(\hat{y}^{(i)},y^{(i)})+\frac{\lambda}{2m}||w||_2^2 $$

$\lambda$: regularization parameter
L2 regularization: $||w||_2^2$
- $\sum_{j=1}^{n_x}w_j^2=w^Tw$
- used most often.
L1 regularization: $||w||_1$
- $\sum_{j=1}^{n_x}|w_j|$
- w will be sparse (the vector w will have a lot of zeros in it)
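
As a rough sketch, the two penalties and the regularized cost $J(w,b)$ can be computed as follows. The binary cross-entropy used for the loss $L$ is an assumption of this example (the notes do not fix a particular loss), and all function names are illustrative.

```python
import math

def l2_penalty(w):
    """Squared L2 norm: sum of w_j^2, i.e. w^T w."""
    return sum(wj * wj for wj in w)

def l1_penalty(w):
    """L1 norm: sum of |w_j|."""
    return sum(abs(wj) for wj in w)

def cross_entropy(y_hat, y):
    """Binary cross-entropy for one example (assumed choice of L)."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

def regularized_cost(y_hats, ys, w, lam):
    """J(w, b) = (1/m) * sum of losses + (lambda / (2m)) * ||w||_2^2."""
    m = len(ys)
    data_term = sum(cross_entropy(p, t) for p, t in zip(y_hats, ys)) / m
    return data_term + lam / (2 * m) * l2_penalty(w)

w = [3.0, -4.0, 0.0]
print(l2_penalty(w))  # 25.0
print(l1_penalty(w))  # 7.0

# Predictions and labels here are made-up values for illustration.
print(regularized_cost([0.9, 0.2], [1, 0], w, lam=0.1))
```

Setting `lam=0` recovers the unregularized cost. A larger `lam` raises the cost of large weights.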

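For a neural network with $L$ layers, the penalty sums the squared Frobenius norm of every layer's weight matrix $w^{[l]}$:

$$ J(w^{[1]},b^{[1]},\dots,w^{[L]},b^{[L]})=\frac{1}{m}\sum^{m}_{i=1}L(\hat{y}^{(i)},y^{(i)})+\frac{\lambda}{2m}\sum^{L}_{l=1}||w^{[l]}||_F^2 $$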
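
The Frobenius-norm penalty in the formula above is the sum of the squares of every entry of each layer's weight matrix. Below is a minimal sketch using nested lists as matrices. The function names and the example weights are assumptions for illustration only.

```python
def frobenius_sq(W):
    """Squared Frobenius norm: sum of squares of every entry of matrix W."""
    return sum(x * x for row in W for x in row)

def l2_regularization_term(weights, lam, m):
    """(lambda / (2m)) * sum over layers of ||W^[l]||_F^2."""
    return lam / (2 * m) * sum(frobenius_sq(W) for W in weights)

# Two made-up layers: W1 is 2x2, W2 is 1x2.
W1 = [[1.0, 2.0], [3.0, 4.0]]   # 1 + 4 + 9 + 16 = 30
W2 = [[0.5, -0.5]]              # 0.25 + 0.25 = 0.5
print(frobenius_sq(W1))  # 30.0
print(frobenius_sq(W2))  # 0.5
print(l2_regularization_term([W1, W2], lam=0.1, m=10))
```

This term is added to the average loss over the $m$ training examples to give the full cost.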