Department of Statistics, Umeå University
Shallow ANN:
Easier to train, more efficient
Simpler decision structure
Better-understood theory
Deep ANN:
‘Arbitrarily’ powerful
More ‘meaningful’ feature extraction
More training challenges
Any solutions to avoid the overfitting problem?
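Common remedies (a sketch of standard techniques, not an exhaustive answer from the slide) include more training data, L2 regularization, early stopping, and dropout. A minimal NumPy sketch of inverted dropout, the variant used by most frameworks:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, drop_prob=0.5, training=True):
    """Inverted dropout: randomly zero units during training and
    rescale the survivors so the expected activation is unchanged."""
    if not training or drop_prob == 0.0:
        return activations
    keep_prob = 1.0 - drop_prob
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

h = np.ones((4, 8))                        # toy hidden-layer activations
h_train = dropout(h, 0.5)                  # about half the units zeroed, survivors scaled to 2.0
h_eval = dropout(h, 0.5, training=False)   # identity at evaluation time
```

Because each surviving unit is divided by `keep_prob`, the network needs no rescaling at test time.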
Pre-training vs. Pre-trained Model vs. Transfer Learning
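To separate the three terms: pre-training is the initial training phase on a large source task; a pre-trained model is the resulting set of learned weights; transfer learning reuses those weights on a new task, often by freezing the early layers and fine-tuning only a task-specific head. A toy NumPy sketch of that last step (the weight names and shapes here are illustrative, not from any real model):

```python
import numpy as np

rng = np.random.default_rng(1)

# Pretend these weights come from a model pre-trained on a large source task.
pretrained = {
    "layer1": rng.normal(size=(4, 4)),   # early feature extractor: frozen
    "head":   rng.normal(size=(4, 2)),   # task-specific head: fine-tuned
}
frozen = {"layer1"}

def sgd_step(weights, grads, lr=0.1):
    """Apply one gradient step, but only to the unfrozen parameters."""
    return {name: (w if name in frozen else w - lr * grads[name])
            for name, w in weights.items()}

grads = {name: np.ones_like(w) for name, w in pretrained.items()}
updated = sgd_step(pretrained, grads)    # "layer1" unchanged, "head" updated
```

Freezing keeps the general-purpose features learned during pre-training intact while the small head adapts to the target task.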
ReLU function vs. sigmoid function
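The key practical difference is gradient saturation: the sigmoid's derivative vanishes for inputs far from zero, while ReLU passes a gradient of 1 for any positive input. A small sketch comparing the two gradients:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)        # peaks at 0.25, decays toward 0 for large |x|

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return (x > 0).astype(float)  # 1 for positive inputs, 0 otherwise

x = np.array([-10.0, 0.0, 10.0])
print(sigmoid_grad(x))  # tiny at the extremes: vanishing gradients in deep stacks
print(relu_grad(x))     # [0., 0., 1.]
```

This is one reason ReLU is the default hidden-layer activation in deep networks, while the sigmoid survives mainly as an output activation for probabilities.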
What is batch learning?
What are the ‘epochs’ and ‘batch_size’ parameters?
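In (mini-)batch learning, the training set is split into batches of `batch_size` examples, and one epoch is one full pass over the data, so an epoch takes `ceil(N / batch_size)` gradient steps. A minimal sketch of one epoch of mini-batch iteration (toy data, illustrative sizes):

```python
import math
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 5))   # toy dataset: N = 1000 examples, 5 features

def minibatches(data, batch_size):
    """Shuffle once, then yield consecutive mini-batches (one epoch)."""
    idx = rng.permutation(len(data))
    for start in range(0, len(data), batch_size):
        yield data[idx[start:start + batch_size]]

batch_size = 32
steps = sum(1 for _ in minibatches(X, batch_size))
print(steps)                               # 32 gradient steps per epoch
print(math.ceil(len(X) / batch_size))      # same number, by the formula
```

Note the last batch is smaller (1000 = 31 × 32 + 8), which frameworks handle the same way by default.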
The smaller/larger the learning rate, the better the training. Is that correct?
A learning rate of 0.01 is too small. Is that correct?
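Neither direction is correct in general: too small a learning rate makes progress very slow, while too large a rate makes the iterates oscillate or diverge, and whether 0.01 is "too small" depends on the problem, the scaling of the data, and the optimizer. The trade-off can be seen on the simplest possible objective, gradient descent on f(w) = w² (a toy sketch, not a recipe for choosing a rate):

```python
def gradient_descent(lr, steps=50, w0=1.0):
    """Minimize f(w) = w**2 (gradient 2w) from w0 with a fixed learning rate."""
    w = w0
    for _ in range(steps):
        w -= lr * 2.0 * w
    return abs(w)

print(gradient_descent(lr=0.001))  # too small: still close to the start
print(gradient_descent(lr=0.1))    # moderate: converges near the minimum at 0
print(gradient_descent(lr=1.1))    # too large: the iterates blow up
```

Each step multiplies w by (1 − 2·lr), so any lr above 1 makes that factor exceed 1 in magnitude and the sequence diverges; in practice the rate is tuned, scheduled, or adapted rather than fixed by a rule of thumb.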