Batch Norm and Transfer Learning — what's going on?

After Ioffe and Szegedy (2015); their batch-norm transform is restated below the caption list.
Fig. 1. Weights at each minibatch (x-axis), plotted as a histogram with 40 bins (y-axis).
Fig. 2. Loss vs. minibatch for training (blue) and validation (red) data, for the ‘Naive model’ without batch norm.
Fig. 3. Histograms for the ‘Naive model’.
Fig. 4. Loss vs. minibatch for training (blue) and validation (red) data, for the ‘Naive model’ with batch norm.
Fig. 5. Change in variance of the weights per batch for each layer in the model; batch norm has a clear smoothing effect (a sketch for reproducing this trace follows the list).
Fig. 6. Loss for the Unfreeze model.
Fig. 7. Loss for the Unfreeze non-BN model.
Fig. 8. Difference in weights over 3 epochs of the Freeze model w.r.t. the Unfreeze model (the freezing regimes are sketched in code below).
Fig. 9. Difference in weights over 3 epochs of the Freeze non-BN model w.r.t. the Unfreeze non-BN model.
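For reference, the transform the figures probe is the one defined in Ioffe and Szegedy (2015). A minimal restatement, with gamma and beta the learned per-channel scale and shift:

```latex
% Batch-norm transform over a minibatch B = {x_1, ..., x_m},
% as defined in Ioffe and Szegedy (2015).
\begin{align*}
  \mu_{\mathcal{B}}      &= \frac{1}{m} \sum_{i=1}^{m} x_i
      && \text{minibatch mean} \\
  \sigma_{\mathcal{B}}^2 &= \frac{1}{m} \sum_{i=1}^{m} \left(x_i - \mu_{\mathcal{B}}\right)^2
      && \text{minibatch variance} \\
  \hat{x}_i              &= \frac{x_i - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^2 + \epsilon}}
      && \text{normalize} \\
  y_i                    &= \gamma \hat{x}_i + \beta
      && \text{learned scale and shift}
\end{align*}
```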
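Figs. 1 and 5 track how the weight distributions evolve over training. A minimal PyTorch sketch of that bookkeeping, called once per minibatch; the function names are my own illustration, not the author's code:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def weight_histogram(layer: nn.Module, bins: int = 40) -> torch.Tensor:
    """One column of a Fig. 1-style plot: histogram (with `bins` bins)
    of a layer's weights at the current minibatch."""
    weights = torch.cat([p.flatten() for p in layer.parameters()])
    return torch.histc(weights, bins=bins)

@torch.no_grad()
def weight_variances(model: nn.Module) -> dict:
    """One point per layer of a Fig. 5-style trace: variance of each
    weight tensor at the current minibatch."""
    return {name: p.var().item()
            for name, p in model.named_parameters()
            if name.endswith("weight")}
```

Calling these after every optimizer step and stacking the results column-by-column reproduces the heatmap and per-layer-trace layouts the figures use.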
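Figs. 6 through 9 compare frozen and unfrozen fine-tuning for models with and without batch norm. A minimal PyTorch sketch of the freezing step and the BN-specific gotcha it runs into, assuming a generic pretrained `model`; the function and flag names are mine, not the author's:

```python
import torch.nn as nn

BN_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)

def freeze(model: nn.Module, freeze_bn_stats: bool = True) -> None:
    """Freeze every parameter in `model`.

    The batch-norm subtlety: requires_grad=False stops gradient updates
    to the BN scale/shift, but the running mean and variance still
    update on every forward pass while the module is in train() mode.
    Putting BN modules in eval() pins those statistics as well; note
    that a later model.train() call flips them back, so this must be
    re-applied after each such call.
    """
    for p in model.parameters():
        p.requires_grad = False
    if freeze_bn_stats:
        for m in model.modules():
            if isinstance(m, BN_TYPES):
                m.eval()  # stop running-stat updates
```

Under this reading, the captions' "non-BN model" is the same architecture built without BatchNorm layers, trained under the same two regimes, which is why the Freeze/Unfreeze weight differences in Figs. 8 and 9 can be compared directly.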
