Case Studies
deeplearning.ai

Why look at case studies?
Outline
Classic networks:
• LeNet-5
• AlexNet
• VGG
ResNet
Inception
Andrew Ng
Classic networks
LeNet-5

32×32×1 → CONV 5×5, s=1 → 28×28×6 → avg pool, f=2, s=2 → 14×14×6 → CONV 5×5, s=1 → 10×10×16 → avg pool, f=2, s=2 → 5×5×16 → FC 120 → FC 84 → ŷ

[LeCun et al., 1998. Gradient-based learning applied to document recognition]
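Every dimension in the chain above follows the usual conv output-size formula n_out = floor((n + 2p - f)/s) + 1. A quick sketch (the helper name is my own) walking LeNet-5's spatial dimensions:

```python
# Hypothetical helper (not from the slide): output size of a conv or pool layer.
def conv_out(n, f, s=1, p=0):
    """Output height/width for an n x n input, f x f filter, stride s, padding p."""
    return (n + 2 * p - f) // s + 1

# LeNet-5's chain, starting from the 32x32x1 input:
n = conv_out(32, f=5, s=1)   # CONV 5x5, s=1   -> 28
n = conv_out(n, f=2, s=2)    # avg pool f=2, s=2 -> 14
n = conv_out(n, f=5, s=1)    # CONV 5x5, s=1   -> 10
n = conv_out(n, f=2, s=2)    # avg pool f=2, s=2 -> 5
print(n)  # 5, matching the 5x5x16 volume before the FC layers
```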
AlexNet

227×227×3 → CONV 11×11, s=4 → 55×55×96 → MAX-POOL 3×3, s=2 → 27×27×96 → CONV 5×5, same → 27×27×256 → MAX-POOL 3×3, s=2 → 13×13×256 → CONV 3×3, same → 13×13×384 → CONV 3×3, same → 13×13×384 → CONV 3×3, same → 13×13×256 → MAX-POOL 3×3, s=2 → 6×6×256 = 9216 → FC 4096 → FC 4096 → Softmax 1000

[Krizhevsky et al., 2012. ImageNet classification with deep convolutional neural networks]
VGG-16

CONV = 3×3 filter, s=1, same; MAX-POOL = 2×2, s=2

224×224×3 → [CONV 64]×2 → 224×224×64 → POOL → 112×112×64 → [CONV 128]×2 → 112×112×128 → POOL → 56×56×128 → [CONV 256]×3 → 56×56×256 → POOL → 28×28×256 → [CONV 512]×3 → 28×28×512 → POOL → 14×14×512 → [CONV 512]×3 → 14×14×512 → POOL → 7×7×512 → FC 4096 → FC 4096 → Softmax 1000

[Simonyan & Zisserman 2015. Very deep convolutional networks for large-scale image recognition]
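The layer list above is enough to count VGG-16's parameters. A sketch (the bookkeeping is my own, not from the slide): each conv layer has 3·3·c_in·c_out weights plus c_out biases, and the FC layers are counted the same way.

```python
# (c_in, c_out) for VGG-16's thirteen 3x3 conv layers, block by block.
convs = [(3, 64), (64, 64),                        # block 1
         (64, 128), (128, 128),                    # block 2
         (128, 256), (256, 256), (256, 256),       # block 3
         (256, 512), (512, 512), (512, 512),       # block 4
         (512, 512), (512, 512), (512, 512)]       # block 5
conv_params = sum(3 * 3 * c_in * c_out + c_out for c_in, c_out in convs)

# FC layers: 7x7x512 flattened -> 4096 -> 4096 -> 1000.
fcs = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]
fc_params = sum(i * o + o for i, o in fcs)

total = conv_params + fc_params
print(total)  # 138357544: the ~138M parameters usually quoted for VGG-16
```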
Residual Networks (ResNets)
Residual block

Main path: z^[l+1] = W^[l+1] a^[l] + b^[l+1],  a^[l+1] = g(z^[l+1]),  z^[l+2] = W^[l+2] a^[l+1] + b^[l+2].
Shortcut ("skip connection"): a^[l] is added before the final nonlinearity, so a^[l+2] = g(z^[l+2] + a^[l]) rather than g(z^[l+2]).

[He et al., 2015. Deep residual networks for image recognition]
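A minimal numpy sketch of one residual block's forward pass (function and variable names are mine; fully connected layers stand in for conv layers to keep it short). Note how zeroing out the second layer's weights makes the block compute the identity:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

def residual_block(a, W1, b1, W2, b2):
    """Forward pass of one residual block: relu(z2 + a) is the skip connection."""
    z1 = W1 @ a + b1
    a1 = relu(z1)
    z2 = W2 @ a1 + b2
    return relu(z2 + a)      # g(z[l+2] + a[l])

rng = np.random.default_rng(0)
n = 4
a = rng.standard_normal(n)

# With W2 = 0 and b2 = 0, the block outputs relu(a): the identity is easy to learn.
out = residual_block(a, rng.standard_normal((n, n)), rng.standard_normal(n),
                     np.zeros((n, n)), np.zeros(n))
print(np.allclose(out, relu(a)))  # True
```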
Residual Network

A plain network becomes a ResNet by adding a skip connection around every pair of layers.

[Figure: training error vs. # layers. For a plain network, training error eventually goes back up as depth grows; for a ResNet, it keeps decreasing.]

[He et al., 2015. Deep residual networks for image recognition]
Why ResNets work
Why do residual networks work?

Because a^[l+2] = g(z^[l+2] + a^[l]), if W^[l+2] and b^[l+2] shrink toward zero then a^[l+2] = g(a^[l]) = a^[l] (for ReLU, since a^[l] ≥ 0). The extra block can therefore easily learn the identity function, so adding it to a plain network doesn't hurt performance and can only help.

[He et al., 2015. Deep residual networks for image recognition]
Network in Network and 1×1 convolutions
What does a 1 × 1 convolution do?

On a 6×6×1 input, a 1×1 filter just multiplies every entry by a scalar:

1 2 3 6 5 8
3 5 5 1 3 4
2 1 3 4 9 3
4 7 8 5 7 9
1 5 3 7 4 8
5 4 9 8 3 5

∗ [2] = the same 6×6 grid with every entry doubled.

On a 6×6×32 volume, however, a 1×1×32 filter takes a dot product across all 32 channels at each position (followed by a nonlinearity); with multiple such filters the output is 6×6×#filters.

[Lin et al., 2013. Network in network]
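In other words, a 1×1 convolution is a fully connected layer applied independently at every spatial position. A numpy sketch (sizes from the slide; the 8 output channels are my arbitrary choice):

```python
import numpy as np

H, W, C_in, C_out = 6, 6, 32, 8
x = np.random.default_rng(1).standard_normal((H, W, C_in))

# One 1x1xC_in filter per output channel, stacked into a (C_in, C_out) matrix.
filters = np.random.default_rng(2).standard_normal((C_in, C_out))

# A 1x1 conv is a dot product across the 32 channels at every (h, w).
y = x @ filters
print(y.shape)  # (6, 6, 8): height/width unchanged, channels = #filters

# Same result position by position:
assert np.allclose(y[2, 3], filters.T @ x[2, 3])
```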
Using 1×1 convolutions

28×28×192 → CONV 1×1, 32 filters, ReLU → 28×28×32

A 1×1 convolution shrinks (or keeps, or grows) the number of channels without changing the height and width.

[Lin et al., 2013. Network in network]
Inception network motivation
Motivation for inception network

Instead of picking one filter size, apply several in parallel to the same 28×28×192 input (all with same padding) and stack the outputs along the channel dimension:

• 1×1 CONV → 28×28×64
• 3×3 CONV → 28×28×128
• 5×5 CONV → 28×28×32
• MAX-POOL → 28×28×32

Output: 28×28×(64+128+32+32) = 28×28×256.

[Szegedy et al. 2014. Going deeper with convolutions]
The problem of computational cost

28×28×192 → CONV 5×5, same, 32 filters → 28×28×32

Each of the 28×28×32 output values needs 5×5×192 multiplications: 28×28×32 × 5×5×192 ≈ 120 million multiplications.
Using 1×1 convolution

28×28×192 → CONV 1×1×192, 16 filters → 28×28×16 (the "bottleneck layer") → CONV 5×5×16, 32 filters → 28×28×32

Multiplications: 28×28×16 × 192 ≈ 2.4 million, plus 28×28×32 × 5×5×16 ≈ 10.0 million, about 12.4 million in total: roughly a tenth of the 120 million above.
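The two multiplication counts are pure arithmetic, so they are easy to check:

```python
# 5x5 conv applied directly to 28x28x192, producing 28x28x32:
direct = 28 * 28 * 32 * (5 * 5 * 192)

# Bottleneck: 1x1 conv down to 16 channels, then the 5x5 conv on 28x28x16:
bottleneck = 28 * 28 * 16 * (1 * 1 * 192) + 28 * 28 * 32 * (5 * 5 * 16)

print(direct, bottleneck)  # 120422400 12443648, about a 10x saving
```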
Inception module
Inception module

The previous activation feeds four parallel branches whose outputs are concatenated along channels ("channel concat"):

• 1×1 CONV
• 1×1 CONV → 3×3 CONV
• 1×1 CONV → 5×5 CONV
• MAXPOOL 3×3, s=1, same → 1×1 CONV
Inception network

The full network stacks many inception modules, with occasional max-pooling layers to reduce the height and width.

[Szegedy et al., 2014, Going Deeper with Convolutions]
(The name comes from the "we need to go deeper" meme: http://knowyourmeme.com/memes/we-need-to-go-deeper)
Convolutional Neural Networks

MobileNet
Motivation for MobileNets

• Low computational cost at deployment
• Useful for mobile and embedded vision applications
• Key idea: normal vs. depthwise-separable convolutions

[Howard et al. 2017, MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications]
Normal Convolution

6×6×3 ∗ 3×3×3 filter = 4×4; with 5 filters, the output is 4×4×5.

Computational cost = #filter params × #filter positions × #filters = (3×3×3) × (4×4) × 5 = 2160.
Depthwise Separable Convolution

Normal convolution: input ∗ filter = output, in one step.
Depthwise separable convolution factors this into two cheaper steps: input ∗ depthwise filter, then ∗ pointwise filter = output.
Depthwise Convolution

6×6×3 ∗ 3×3 (one filter per input channel) = 4×4×3

Computational cost = #filter params × #filter positions × #filters = (3×3) × (4×4) × 3 = 432.
Depthwise Separable Convolution

Depthwise convolution: one f×f filter per input channel; the number of channels is unchanged.
Pointwise convolution: 1×1×n_c filters that mix the channels and set the output depth.
Pointwise Convolution

4×4×3 ∗ 1×1×3 = 4×4; with 5 filters, the output is 4×4×5.

Computational cost = #filter params × #filter positions × #filters = (1×1×3) × (4×4) × 5 = 240.
Cost Summary

Cost of normal convolution: 2160
Cost of depthwise separable convolution: 432 + 240 = 672
Ratio: 672/2160 ≈ 0.31; in general, (1/n_c′) + (1/f²).

[Howard et al. 2017, MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications]
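Both costs follow the same formula, cost = #filter params × #filter positions × #filters, so the toy example can be checked directly:

```python
normal = (3 * 3 * 3) * (4 * 4) * 5     # 5 normal 3x3x3 filters on 6x6x3 -> 2160
depthwise = (3 * 3) * (4 * 4) * 3      # one 3x3 filter per channel      -> 432
pointwise = (1 * 1 * 3) * (4 * 4) * 5  # 5 pointwise 1x1x3 filters       -> 240

print(normal, depthwise + pointwise)   # 2160 672

# The general ratio (1/n_c') + (1/f^2): here n_c' = 5 output channels, f = 3.
assert abs((depthwise + pointwise) / normal - (1 / 5 + 1 / 9)) < 1e-12
```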
MobileNet Architecture
MobileNet

MobileNet v1: a stack of depthwise-separable convolution blocks.
MobileNet v2: adds a residual connection, and each block has three steps: Expansion, Depthwise, Projection.

[Sandler et al. 2019, MobileNetV2: Inverted Residuals and Linear Bottlenecks]
MobileNet v2 Bottleneck

A residual connection runs from the block's input to its output. Inside the block: Expansion (1×1 conv that increases the number of channels) → Depthwise (3×3 depthwise conv) → Pointwise/Projection (1×1 conv that reduces the channels back down).

[Sandler et al. 2019, MobileNetV2: Inverted Residuals and Linear Bottlenecks]
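A shape-only sketch of the bottleneck (the helper name and the 28×28×24 input are mine; the factor-6 expansion is the paper's default). The projection returns to the input depth so the residual addition is shape-compatible:

```python
def bottleneck_shapes(h, w, c, t=6):
    """Shapes flowing through one MobileNetV2 bottleneck with expansion factor t."""
    expanded = (h, w, c * t)    # expansion: 1x1 conv grows channels by t
    depthwise = (h, w, c * t)   # depthwise: 3x3 per-channel conv (same padding, s=1)
    projected = (h, w, c)       # projection: 1x1 conv back down; residual add fits
    return expanded, depthwise, projected

e, d, p = bottleneck_shapes(28, 28, 24)
print(e, d, p)  # (28, 28, 144) (28, 28, 144) (28, 28, 24)
```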
MobileNet v2 Full Architecture

conv2d → bottleneck block ×17 → conv2d 1×1 → avgpool 7×7 → conv2d 1×1

[Sandler et al. 2019, MobileNetV2: Inverted Residuals and Linear Bottlenecks]
EfficientNet
EfficientNet

Three ways to scale up a baseline network:

• Wider: more channels per layer
• Deeper: more layers
• Higher resolution: larger input images

Compound scaling: scale width, depth, and resolution together, in a fixed ratio.

[Tan and Le, 2019, EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks]
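A sketch of the compound scaling rule. The constants α, β, γ below are the values the paper reports from its grid search for EfficientNet-B0; φ is the user-chosen compound coefficient:

```python
alpha, beta, gamma = 1.2, 1.1, 1.15  # EfficientNet-B0's reported constants
phi = 1                               # compound coefficient (bigger = bigger model)

depth = alpha ** phi        # multiply the number of layers by this
width = beta ** phi         # multiply the number of channels by this
resolution = gamma ** phi   # multiply the input resolution by this

# The constants are chosen so that raising phi by 1 roughly doubles FLOPs,
# i.e. alpha * beta^2 * gamma^2 is approximately 2:
print(alpha * beta ** 2 * gamma ** 2)
```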
Practical advice for using ConvNets

Transfer Learning
Data augmentation
Common augmentation methods

• Mirroring
• Random cropping
• Rotation
• Shearing
• Local warping
• …
Color shifting

Add offsets to the R, G, B channels, e.g.:

• (+20, −20, +20)
• (−20, +20, +20)
• (+5, 0, +50)
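A minimal numpy sketch of color shifting (function name is mine): add the per-channel offsets and clip back to the valid [0, 255] range.

```python
import numpy as np

def color_shift(img, shift):
    """img: HxWx3 uint8 image; shift: (dR, dG, dB) offsets."""
    # Widen to int16 first so the addition can't wrap around uint8.
    return np.clip(img.astype(np.int16) + np.array(shift), 0, 255).astype(np.uint8)

img = np.full((2, 2, 3), 128, dtype=np.uint8)   # tiny solid-gray test image
shifted = color_shift(img, (+20, -20, +20))
print(shifted[0, 0])  # [148 108 148]
```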
Implementing distortions during training

A common pattern: CPU threads load images from disk and apply the distortions on the fly, forming mini-batches that are passed (often in parallel) to the training process.
The state of computer vision
Data vs. hand-engineering

Two sources of knowledge:

• Labeled data
• Hand-engineered features / network architecture / other components
Tips for doing well on benchmarks/winning competitions

Ensembling
• Train several networks independently and average their outputs

Multi-crop at test time
• Run classifier on multiple versions of test images and average results
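Both tips reduce to the same operation: average the class-probability vectors and pick the argmax. A sketch with made-up softmax outputs (multi-crop averaging is the identical computation across crops of one image):

```python
import numpy as np

# Hypothetical softmax outputs from three independently trained models
# (rows) over three classes (columns).
preds = np.array([
    [0.7, 0.2, 0.1],   # model 1
    [0.5, 0.4, 0.1],   # model 2
    [0.6, 0.1, 0.3],   # model 3
])

ensemble = preds.mean(axis=0)   # average the probability vectors
print(ensemble.argmax())        # 0: class 0 wins the averaged vote
```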
Use open source code
• Use architectures of networks published in the literature
• Use open source implementations if possible
• Use pretrained models and fine-tune on your dataset