Start Simple

How to start simple with deep learning models?

Summary

Choose simple architecture:
- LeNet/ResNet for images.
- LSTM for sequences.
- Fully-connected network with one hidden layer for all other tasks.
Use sensible hyper-parameter defaults:
- Adam optimizer with a “magic” learning rate value of 3e-4.
- ReLU activation for fully-connected and convolutional models and TanH activation for LSTM models.
- He initialization for ReLU and Glorot initialization for TanH.
- No regularization and data normalization.
Normalize data inputs: subtracting the mean and dividing by the variance.
Simplify the problem:
- Working with a small training set around 10,000 examples.
- Using a fixed number of objects, classes, input size, etc.
- Creating a simpler synthetic training set like in research labs.

Last updated 3 years ago