AWS GPU Setup for Deep Learning
01. Overview
02. Create an AWS Account
03. Get Access to GPU Instances
04. Apply Credits
05. Launch an Instance
06. Login to the Instance
07. More Resources
01. Overview
Training and evaluating deep neural networks is a computationally intensive task. For
modestly sized problems and datasets it may be possible to train your network on the CPU of
your local machine, but it could take anywhere from 15 minutes to several hours depending on
the number of epochs, the size of the neural network, and other factors. A faster alternative is
to train on a GPU (Graphics Processing Unit), a type of processor that supports greater
parallelism.
If you do not already have a computer with a built-in NVIDIA GPU, we suggest you use
an Amazon EC2 instance. There are many cloud service providers that offer equivalent
functionality, but EC2 is a reasonable default that is available to most students. In the next few
sections, we'll go over the steps from nothing to running a neural network on an Amazon
server.
Note: Please skip this section if you are planning to use your own GPU or CPU (or
otherwise not planning to use Amazon EC2).
Note: We do not currently provide AWS credits to MLND subscription students.
02. Create an AWS Account
Create an AWS Account
When you sign up, you will need to provide a credit card. But don’t worry, you won’t be
charged for anything yet.
Furthermore, when you sign up, you will also need to choose a support plan. You can choose
the free Basic Support Plan.
Once you finish signing up, wait a few minutes to receive your AWS account confirmation
email. Then return to aws.amazon.com and sign in.
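03. Get Access to GPU Instances
Get Access to GPU Instances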
Amazon Web Services has a service called Elastic Compute Cloud (EC2), which allows you to
launch virtual servers (or “instances”), including instances with attached GPUs. The specific
type of GPU instance you should launch for this tutorial is called “p2.xlarge”.
We will use this AMI (Amazon Machine Image) to define the operating system for your
instance, and to make use of its pre-installed software. In order to use this AMI,
you must change your AWS region to one of the following (and you are encouraged to select
the region in the list that is closest to you):
EU (Ireland)
Asia Pacific (Seoul)
Asia Pacific (Tokyo)
Asia Pacific (Sydney)
US East (N. Virginia)
US East (Ohio)
US West (Oregon)
After changing your AWS region, view your EC2 Service Limit report
at: https://console.aws.amazon.com/ec2/v2/home?#Limits, and find your "Current Limit" for
the p2.xlarge instance type.
By default, AWS sets a limit of 0 on the number of p2.xlarge instances a user can run, which
effectively prevents you from launching this instance.
If your limit of p2.xlarge instances is 0, you'll need to increase the limit before you can launch
an instance. From the EC2 Service Limits page, click on “Request limit increase” next to
“p2.xlarge”.
You will not be charged for requesting a limit increase. You will only be charged once you
actually launch an instance.
On the service request form, you will need to complete several fields.
For the “Region” field, select the AWS region you chose in Step 2.
For the “New limit value”, enter a value of 1 (or more, if you wish).
For the “Use Case Description”, you can simply state: “I would like to use GPU instances for
deep learning.”
04. Apply Credits
Apply Credits
All students are provided with AWS credits to train their neural networks. To get your AWS
credits, go to the 'Resources' tab on the left of the classroom; there will be an 'AWS Credits'
link to click on there.
This should bring up a page like the one below. Fill in the data for this page. In your AWS
account, your AWS Account ID can be found under 'My Account', which is itself found in
the dropdown under the name of your account, between the bell and the 'Global' dropdown.
After you've gone through all the steps, you'll receive an email with your code. The email will
be sent to the email address you entered on the AWS credits application. It may take up to 48
hours to receive this email, though it is much quicker in most cases.
Under "AWS Promotional Credit" in the email, you'll find your code. Use this code on the
link provided, which is your account credits page.
05. Launch an Instance
Launch an Instance
Once AWS approves your GPU Limit Increase Request, you can start the process of launching
your instance.
Visit the EC2 Management Console, and click on the “Launch Instance” button.
Next, you must choose an AMI (Amazon Machine Image) which defines the operating system
for your instance, as well as any configurations and pre-installed software.
Click on AWS Marketplace, and search for Deep Learning AMI with Source Code (CUDA
8, Ubuntu). Once you find the appropriate AMI, click on the "Select" button.
This Amazon Machine Image (AMI) contains all the environment files and drivers for you to
train on a GPU. It has cuDNN and many of the other packages required for this course. Any
additional packages required for specific projects will be detailed in the appropriate project
instructions.
You must next choose an instance type, which is the hardware on which the AMI will run.
Filter the instance list to only show "GPU compute", and select the p2.xlarge instance type.
Then follow the prompts to launch the instance, creating and downloading a key pair
(.pem file) that you'll use to log in.
Once you see "2/2 checks passed" on the EC2 Management Console, your instance is ready
for you to log in.
Note the "IPv4 Public IP" address (in the format of “X.X.X.X”) on the EC2 Dashboard.
From a terminal, navigate to the location where you stored your .pem file. (For example, if you
put your .pem file on your Desktop, cd ~/Desktop/ will move you to the correct directory.)
Then log in to your instance with ssh -i YourKeyName.pem ubuntu@X.X.X.X, where X.X.X.X
is the IPv4 Public IP you noted above.
Note that if you've used a different AMI or specified a username, ubuntu will be replaced
with that username, such as ec2-user for some Amazon AMIs. You would then instead
enter ssh -i YourKeyName.pem ec2-user@X.X.X.X.
In your instance, in order to create a config file for your Jupyter notebook settings,
type: jupyter notebook --generate-config .
Then, to change the IP address config setting for notebooks (this is just a fancy one-line
command to perform an exact string match replacement; you could do the same thing
manually using vi/vim/nano/etc.), type: sed -ie "s/#c.NotebookApp.ip =
'localhost'/c.NotebookApp.ip = '*'/g" ~/.jupyter/jupyter_notebook_config.py
(Note that the replacement drops the leading #, so that the setting is uncommented and
actually takes effect.)
Make sure everything is working properly by verifying that the instance can run a Keras
notebook.
You will need the token generated by your Jupyter notebook to access it. When you launch
the notebook server, your instance terminal will print a line of the form: "Copy/paste this
URL into your browser when you connect for the first time, to login with a token: ...".
Copy everything starting with the :8888/?token=.
Access the Jupyter notebook index from your web browser by visiting: X.X.X.X:8888/?
token=... (where X.X.X.X is the IP address of your EC2 instance and everything starting
with :8888/?token= is what you just copied).
Click on a folder, like "mnist", to enter it and select a notebook, such as the
"mnist_mlp.ipynb" notebook.
Run each cell in the notebook.
For some notebooks, you should see a marked decrease in training time when compared to
running the same cells using a typical CPU!
NOTE: Windows users may prefer connecting via the GUI utility PuTTY, by following these
instructions.
Important: Cost
**From this point on, AWS will charge you for running an EC2 instance.** You can find
the details on the EC2 On-Demand Pricing page.
Most importantly, remember to stop (i.e. shutdown) your instances when you are not using
them. Otherwise, your instances might run for a day or a week or a month without you
remembering, and you’ll wind up with a large bill!
AWS charges primarily for running instances, so most of the charges will cease once you stop
the instance. However, there are smaller storage charges that continue to accrue until you
“terminate” (i.e. delete) the instance.
Deep learning is not well-understood, and the practice is ahead of the theory in many cases. If
you are new to deep learning, you are strongly encouraged to experiment with many models,
to develop intuition about why models work. Starter code is provided on GitHub.
In this mini project, you'll modify the neural network in mnist_mlp.ipynb and compare the
resultant model configurations.
Remember: Overfitting is detected by comparing the validation loss to the training loss. If the
training loss is much lower than the validation loss, then the model might be overfitting.
Instructions
Task Description:
Take note of the validation loss and test accuracy for the model resulting from each of the
below changes. (Try out each amendment in a separate model.)
Task List:
Increase (or decrease) the number of nodes in each of the hidden layers. Do you notice
evidence of overfitting (or underfitting)?
Increase (or decrease) the number of hidden layers. Do you notice evidence of overfitting (or
underfitting)?
Remove the dropout layers in the network. Do you notice evidence of overfitting?
Remove the ReLU activation functions. Does the test accuracy decrease?
Remove the image pre-processing step, in which every pixel is divided by 255. Does the
accuracy decrease?
Try a different optimizer, such as stochastic gradient descent.
Increase (or decrease) the batch size.
Task Feedback:
Great job! Now, you're ready to learn about Convolutional Neural Networks (CNNs)!
Optional Resource
Image source: http://iamaaditya.github.io/2016/03/one-by-one-convolution/
To create a convolutional layer in Keras, you must first import the necessary module:
from keras.layers import Conv2D. You must then pass the following arguments:
filters - The number of filters.
kernel_size - Number specifying both the height and width of the (square) convolution window.
There are some additional, optional arguments that you might like to tune:
strides - The stride of the convolution. If you don't specify anything, strides is set to 1.
padding - One of 'valid' or 'same'. If you don't specify anything, padding is set to 'valid'.
activation - Typically 'relu'. If you don't specify anything, no activation is applied. You
are strongly encouraged to add a ReLU activation function to every convolutional layer in
your networks.
NOTE: It is possible to represent both kernel_size and strides as either a number or a
tuple.
When using your convolutional layer as the first layer (appearing after the input layer) in a
model, you must provide an additional input_shape argument:
input_shape - Tuple specifying the height, width, and depth (in that order) of the input.
NOTE: Do not include the input_shape argument if the convolutional layer is not the first
layer in your network.
There are many other tunable arguments that you can set to change the behavior of your
convolutional layers. To read more about these, we recommend perusing the
official documentation.
Example #1
Say I'm constructing a CNN, and my input layer accepts grayscale images that are 200 by 200
pixels (corresponding to a 3D array with height 200, width 200, and depth 1). Then, say I'd
like the next layer to be a convolutional layer with 16 filters, each with a width and height of
2. When performing the convolution, I'd like the filter to jump two pixels at a time. I also don't
want the filter to extend outside of the image boundaries; in other words, I don't want to pad
the image with zeros. Then, to construct this convolutional layer, I would use the following
line of code:
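Conv2D(filters=16, kernel_size=2, strides=2, activation='relu', input_shape=(200, 200, 1))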
Example #2
Say I'd like the next layer in my CNN to be a convolutional layer that takes the layer
constructed in Example 1 as input. Say I'd like my new layer to have 32 filters, each with a
height and width of 3. When performing the convolution, I'd like the filter to jump 1 pixel at a
time. I want the convolutional layer to see all regions of the previous layer, and so I don't mind
if the filter hangs over the edge of the previous layer when it's performing the convolution.
Then, to construct this convolutional layer, I would use the following line of code:
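Conv2D(filters=32, kernel_size=3, padding='same', activation='relu')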
Example #3
If you look up code online, it is also common to see convolutional layers in Keras in this
format:
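Conv2D(64, (3, 3), activation='relu')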
Image
source: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
Dimensionality
Just as with neural networks, we create a CNN in Keras by first creating
a Sequential model.
Copy and paste the following code into a Python executable named conv-dims.py:
from keras.models import Sequential
from keras.layers import Conv2D

model = Sequential()
model.add(Conv2D(filters=16, kernel_size=2, strides=2, padding='valid',
                 activation='relu', input_shape=(200, 200, 1)))
model.summary()
We will not train this CNN; instead, we'll use the executable to study how the dimensionality
of the convolutional layer changes, as a function of the supplied arguments.
Run python path/to/conv-dims.py and look at the output. It should appear as follows:
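_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 100, 100, 16)      80
=================================================================
Total params: 80
Trainable params: 80
Non-trainable params: 0
_________________________________________________________________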
Feel free to change the values assigned to the arguments ( filters , kernel_size , etc) in
your conv-dims.py file.
Take note of how the number of parameters in the convolutional layer changes. This
corresponds to the value under Param # in the printed output. In the output above, the
convolutional layer has 80 parameters.
Also notice how the shape of the convolutional layer changes. This corresponds to the value
under Output Shape in the printed output. In the output above, None corresponds to the
batch size, and the convolutional layer has a height of 100, width of 100, and depth of 16.
Notice that K = filters, and F = kernel_size. Likewise, D_in is the last value in
the input_shape tuple.
Since there are F*F*D_in weights per filter, and the convolutional layer is composed
of K filters, the total number of weights in the convolutional layer is K*F*F*D_in. Since there
is one bias term per filter, the convolutional layer has K biases. Thus, the number of
parameters in the convolutional layer is given by K*F*F*D_in + K.
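For example, for the convolutional layer constructed in conv-dims.py above, K = 16, F = 2,
and D_in = 1, so the number of parameters is 16*2*2*1 + 16 = 80, matching the Param # in
the printed output.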
Formula: Shape of a Convolutional Layer
The depth of the convolutional layer will always equal the number of filters K .
If padding = 'same' , then the spatial dimensions of the convolutional layer are the
following:
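height = ceil(float(H_in) / float(S))
width = ceil(float(W_in) / float(S))
where H_in and W_in are the height and width of the previous layer, and S is the stride.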
If padding = 'valid' , then the spatial dimensions of the convolutional layer are the
following:
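height = ceil(float(H_in - F + 1) / float(S))
width = ceil(float(W_in - F + 1) / float(S))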
Quiz
Please change the conv-dims.py file, so that it appears as follows:
from keras.models import Sequential
from keras.layers import Conv2D

model = Sequential()
model.add(Conv2D(filters=32, kernel_size=3, strides=2, padding='same',
                 activation='relu', input_shape=(128, 128, 3)))
model.summary()
Run python path/to/conv-dims.py , and use the output to answer the questions below.
How many parameters does the convolutional layer have?
902
306
896
1034
SOLUTION:896
What is the depth of the convolutional layer?
3
16
32
64
SOLUTION:32
What is the width of the convolutional layer?
3
16
32
64
SOLUTION:64
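To verify these answers with the formulas above: the number of parameters is
K*F*F*D_in + K = 32*3*3*3 + 32 = 896; the depth equals the number of filters, K = 32;
and with padding = 'same', the width is ceil(128 / 2) = 64.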
Optional Resource
Image source: http://cs231n.github.io/convolutional-networks/
To create a max pooling layer in Keras, you must first import the necessary module:
from keras.layers import MaxPooling2D. You must then include the following argument:
pool_size - Number specifying the height and width of the pooling window.
There are some additional, optional arguments that you might like to tune:
strides - The vertical and horizontal stride. If you don't specify anything, strides will
default to pool_size.
padding - One of 'valid' or 'same'. If you don't specify anything, padding is set
to 'valid'.
NOTE: It is possible to represent both pool_size and strides as either a number or a
tuple.
Say I'm constructing a CNN, and I'd like to reduce the dimensionality of a convolutional layer
by following it with a max pooling layer. Say the convolutional layer has size (100, 100,
15) , and I'd like the max pooling layer to have size (50, 50, 15) . I can do this by using a
2x2 window in my max pooling layer, with a stride of 2, which could be constructed in the
following line of code:
MaxPooling2D(pool_size=2, strides=2)
If you'd instead like to use a stride of 1, but still keep the size of the window at 2x2, then you'd
use:
MaxPooling2D(pool_size=2, strides=1)
Checking the Dimensionality of Max Pooling Layers
Copy and paste the following code into a Python executable named pool-dims.py:
from keras.models import Sequential
from keras.layers import MaxPooling2D

model = Sequential()
model.add(MaxPooling2D(pool_size=2, strides=2, input_shape=(100, 100, 15)))
model.summary()
Run python path/to/pool-dims.py and look at the output. It should appear as follows:
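_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
max_pooling2d_1 (MaxPooling2 (None, 50, 50, 15)        0
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________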
Feel free to change the arguments in your pool-dims.py file, and check how the shape of the
max pooling layer changes.
Just as with neural networks, we create a CNN in Keras by first creating
a Sequential model.
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(filters=16, kernel_size=2, padding='same', activation='relu',
                 input_shape=(32, 32, 3)))
model.add(MaxPooling2D(pool_size=2))
model.add(Conv2D(filters=32, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Conv2D(filters=64, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Flatten())
model.add(Dense(500, activation='relu'))
model.add(Dense(10, activation='softmax'))
The network begins with a sequence of three convolutional layers, followed by max
pooling layers. These first six layers are designed to take the input array of image
pixels and convert it to an array where all of the spatial information has been
squeezed out, and only information encoding the content of the image remains. The
array is then flattened to a vector in the seventh layer of the CNN. It is followed by
two dense layers designed to further elucidate the content of the image. The final
layer has one entry for each object class in the dataset, and has a softmax activation
function, so that it returns probabilities.
NOTE: In the video, you might notice that convolutional layers are specified
with Convolution2D instead of Conv2D . Either is fine for Keras 2.0, but Conv2D is
preferred.
Things to Remember
Always add a ReLU activation function to the Conv2D layers in your CNN. With
the exception of the final layer in the network, Dense layers should also have a
ReLU activation function.
When constructing a network for classification, the final layer in the network
should be a Dense layer with a softmax activation function. The number of
nodes in the final layer should equal the total number of classes in the dataset.
Have fun! If you start to feel discouraged, we recommend that you check
out Andrej Karpathy's tumblr with user-submitted loss functions,
corresponding to models that gave their owners some trouble. Recall that the
loss is supposed to decrease during training. These plots show very different
behavior :).
19. Mini Project: CNNs in Keras
Mini Project: CNNs in Keras
In this mini project, you'll modify the architecture of the neural network
in cifar10_cnn.ipynb. Starter code is provided on GitHub.
Instructions
QUESTION:
Specify a new CNN architecture in Step 5: Define the Model Architecture in the
notebook. If you need more inspiration, check out this link.
Train your new model. Once you've finished, check the accuracy on the test dataset,
and report the percentage in the text box below.
Feel free to amend other parts of the notebook: for instance, what happens when you
use a different optimizer?
ANSWER:
Thank you for completing the mini project!
Optional Resources
If you would like to know more about interpreting CNNs and convolutional layers in
particular, you are encouraged to check out these resources:
The CNN we will look at is trained on ImageNet as described in this paper by Zeiler and
Fergus. In the images below (from the same paper), we’ll see what each layer in this network
detects and see how each layer detects more and more complex ideas.
Example patterns that cause activations in the first layer of the network. These range from
simple diagonal lines (top left) to green blobs (bottom middle).
The images above are from Matthew Zeiler and Rob Fergus' deep visualization toolbox, which
lets us visualize what each layer in a CNN focuses on.
Each image in the above grid represents a pattern that causes the neurons in the first layer to
activate - in other words, they are patterns that the first layer recognizes. The top left image
shows a -45 degree line, while the middle top square shows a +45 degree line. These squares
are shown below again for reference.
As visualized here, the first layer of the CNN can recognize -45 degree lines.
The first layer of the CNN is also able to recognize +45 degree lines, like the one above.
Let's now see some example images that cause such activations. The below grid of images all
activated the -45 degree line. Notice how they are all selected despite the fact that they have
different colors, gradients, and patterns.
Example patches that activate the -45 degree line detector in the first layer.
So, the first layer of our CNN clearly picks out very simple shapes and patterns like lines and
blobs.
Layer 2
25. Transfer Learning
Transfer Learning
Transfer learning involves taking a pre-trained neural network and adapting the neural
network to a new, different data set.
Depending on both:
the size of the new data set, and
the similarity of the new data set to the original training data,
the approach for using transfer learning will be different. There are four main cases:
1. new data set is small, new data is similar to original training data
2. new data set is small, new data is different from original training data
3. new data set is large, new data is similar to original training data
4. new data set is large, new data is different from original training data
26. Transfer Learning in Keras
Transfer Learning in Keras
The Jupyter notebook described in the video can be accessed from the aind2-cnn GitHub
repository linked here. Navigate to the transfer-learning/ folder and
open transfer_learning.ipynb. If you'd like to learn how to calculate your own bottleneck
features, look at bottleneck_features.ipynb. (You may have trouble
running bottleneck_features.ipynb on an AWS GPU instance - if so, feel free to use the
notebook on your local CPU/GPU instead!)
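If you'd like a feel for what transfer learning code looks like before opening the notebook,
here is a minimal Keras sketch (not the notebook's exact code) that freezes a pre-trained
VGG16 feature extractor and trains only a new classification head; the input size and the
10-class output are illustrative assumptions:
from keras.applications.vgg16 import VGG16
from keras.models import Sequential
from keras.layers import Flatten, Dense

# Load VGG16 with ImageNet weights, dropping its fully connected head.
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False  # freeze the pre-trained convolutional layers

# Stack a new, trainable classification head on top of the frozen base.
model = Sequential()
model.add(base)
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dense(10, activation='softmax'))  # 10 classes is a placeholder
model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
              metrics=['accuracy'])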
Optional Resources
Here's the first research paper to propose GAP layers for object localization.
Check out this repository that uses a CNN for object localization.
Watch this video demonstration of object localization with a CNN.
Check out this repository that uses visualization techniques to better understand bottleneck
features.
01. Convolutional Layers
Convolutional Layers
The convolution for each 3x3 section is calculated against the weight, [[1, 0, 1], [0, 1,
0], [1, 0, 1]] , and then a bias is added to create the convolved feature on the right. In this
case, the bias is zero.
import tensorflow as tf

# output depth
k_output = 64

# image dimensions
image_width = 10
image_height = 10
color_channels = 3

# convolution filter dimensions (example values)
filter_size_width = 5
filter_size_height = 5

# input/image
input = tf.placeholder(
    tf.float32,
    shape=[None, image_height, image_width, color_channels])

# weight and bias
weight = tf.Variable(tf.truncated_normal(
    [filter_size_height, filter_size_width, color_channels, k_output]))
bias = tf.Variable(tf.zeros(k_output))

# apply convolution
conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
# add bias
conv_layer = tf.nn.bias_add(conv_layer, bias)
# apply activation function
conv_layer = tf.nn.relu(conv_layer)
The code above uses the tf.nn.conv2d() function to compute the convolution
with weight as the filter and [1, 2, 2, 1] for the strides.
The tf.nn.bias_add() function adds a 1-d bias to the last dimension in a matrix. (Note:
using tf.add() doesn't work when the tensors aren't the same shape.)
The tf.nn.relu() function applies a ReLU activation function to the layer.
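As a quick check on the dimensions: with 'SAME' padding and a stride of 2, the output
height and width are ceil(10 / 2) = 5, and the output depth is k_output = 64, so conv_layer
has shape (None, 5, 5, 64) for the values defined above.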
02. Quiz: Convolutional Layers
Using Convolutional Layers in TensorFlow
Let's now build a convolutional layer in TensorFlow. In the below exercise, you'll be asked to
set up the dimensions of the convolution filters, the weights, and the biases. This is in many
ways the trickiest part of using CNNs in TensorFlow. Once you have a sense of how to set up
the dimensions of these attributes, applying CNNs will be far more straightforward.
Review
You should go over the TensorFlow documentation for 2D convolutions. Most of the
documentation is straightforward, except perhaps the padding argument. The output
dimensions differ depending on whether you pass 'VALID' or 'SAME'.
Instructions
Start Quiz:
conv.py
"""
Setup the strides, padding and filter weight/bias such that
the output shape is (1, 2, 2, 3).
"""
import tensorflow as tf
import numpy as np
def conv2d(input):
# Filter (weights and bias)
# The shape of the filter weight is (height, width, input_depth,
output_depth)
# The shape of the filter bias is (output_depth,)
# TODO: Define the filter weights `F_W` and filter bias `F_b`.
# NOTE: Remember to wrap them in `tf.Variable`, they are
trainable parameters after all.
F_W = ?
F_b = ?
# TODO: Set the stride for each dimension (batch_size, height,
width, depth)
strides = [?, ?, ?, ?]
# TODO: set the padding, either 'VALID' or 'SAME'.
padding = ?
#
https://www.tensorflow.org/versions/r0.11/api_docs/python/nn.html#con
v2d
# `tf.nn.conv2d` does not include the bias computation so we have
to add it ourselves after.
return tf.nn.conv2d(input, F_W, strides, padding) + F_b
out = conv2d(X)
03. Solution: Convolutional Layers
Solution
Here's how I did it. NOTE: there is more than one way to get the correct output shape.
Your answer might differ from mine.
def conv2d(input):
    # create the filter (weights and bias)
    F_W = tf.Variable(tf.truncated_normal((2, 2, 1, 3)))
    F_b = tf.Variable(tf.zeros(3))
    strides = [1, 2, 2, 1]
    padding = 'VALID'
    x = tf.nn.conv2d(input, F_W, strides, padding)
    return tf.nn.bias_add(x, F_b)
I want to transform the input shape (1, 4, 4, 1) to (1, 2, 2, 3) . I
choose 'VALID' for the padding algorithm. I find it simpler to understand, and it
achieves the result I'm looking for.
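To check the math: with 'VALID' padding, the output height and width are
ceil((4 - 2 + 1) / 2) = 2, and the output depth equals the filter's output_depth of 3, which
gives the target shape (1, 2, 2, 3).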
The image below is an example of max pooling with a 2x2 filter and stride of 2. The four 2x2
colors represent each time the filter was applied to find the maximum value.
For example, [[1, 0], [4, 6]] becomes 6 , because 6 is the maximum value in this set.
Similarly, [[2, 3], [6, 8]] becomes 8 .
Conceptually, the benefit of the max pooling operation is to reduce the size of the input, and
allow the neural network to focus on only the most important elements. Max pooling does this
by only retaining the maximum value for each filtered area, and removing the remaining
values.
...
conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
conv_layer = tf.nn.bias_add(conv_layer, bias)
conv_layer = tf.nn.relu(conv_layer)
# apply max pooling
conv_layer = tf.nn.max_pool(
conv_layer,
ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1],
padding='SAME')
The tf.nn.max_pool() function performs max pooling with the ksize parameter as the size
of the filter and the strides parameter as the length of the stride. 2x2 filters with a stride of
2x2 are common in practice.
The ksize and strides parameters are structured as 4-element lists, with each element
corresponding to a dimension of the input tensor ( [batch, height, width, channels] ).
For both ksize and strides , the batch and channel dimensions are typically set to 1 .
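For reference, with 'SAME' padding the pooled height and width are
ceil(input_height / stride) and ceil(input_width / stride), so the common 2x2 filter with a
2x2 stride roughly halves the spatial dimensions and leaves the depth unchanged.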
05. Quiz: Max Pooling Layers
Using Max Pooling Layers in TensorFlow
In the below exercise, you'll be asked to set up the dimensions of the pooling filters, strides, as
well as the appropriate padding.
Review
You should go over the TensorFlow documentation for tf.nn.max_pool() . Padding works
the same as it does for a convolution.
Instructions
Setup the strides, padding and ksize such that the output shape after pooling
is (1, 2, 2, 1).
Start Quiz:
pool.py
"""
Set the values to `strides` and `ksize` such that
the output shape after pooling is (1, 2, 2, 1).
"""
import tensorflow as tf
import numpy as np
def maxpool(input):
# TODO: Set the ksize (filter size) for each dimension
(batch_size, height, width, depth)
ksize = [?, ?, ?, ?]
# TODO: Set the stride for each dimension (batch_size, height,
width, depth)
strides = [?, ?, ?, ?]
# TODO: set the padding, either 'VALID' or 'SAME'.
padding = ?
#
https://www.tensorflow.org/versions/r0.11/api_docs/python/nn.html#max
_pool
return tf.nn.max_pool(input, ksize, strides, padding)
out = maxpool(X)
Here's how I did it. NOTE: there is more than one way to get the correct output shape.
Your answer might differ from mine.
def maxpool(input):
    ksize = [1, 2, 2, 1]
    strides = [1, 2, 2, 1]
    padding = 'VALID'
    return tf.nn.max_pool(input, ksize, strides, padding)
I want to transform the input shape (1, 4, 4, 1) to (1, 2, 2, 1) . I
choose 'VALID' for the padding algorithm. I find it simpler to understand and it
achieves the result I'm looking for.
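Checking the math as before: with 'VALID' padding, the pooled height and width are
ceil((4 - 2 + 1) / 2) = 2, and max pooling leaves the depth of 1 unchanged, giving the target
shape (1, 2, 2, 1).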
There are many wonderful free resources that allow you to go into more depth on
convolutional neural networks. In this course, our goal is to give you just enough intuition to
start applying this concept to real-world problems, and enough exposure to explore more on
your own. We strongly encourage you to explore some of these resources to reinforce your
intuition and explore different ideas.
In this lesson, we'll be covering weight initialization. You'll learn how to find good initial
weights for a neural network. Having good initial weights can place the neural network close
to the optimal solution, which allows it to reach the best solution more quickly.
You can get the notebook files from our public GitHub repo, in the weight-
initialization folder. Download the files from GitHub, or clone the repo.
04. Too Small
Weight Initialization 3
05. Normal Distribution
Weight Initialization 4
06. Additional Material
Additional Material
New techniques for dealing with weights are discovered every few years. We've provided the
most popular papers in this field over the years.
Autoencoders
In this lesson we're covering autoencoders. These models are used to compress data, as well as
for image denoising, which you'll be implementing in this lesson. The idea here is that we'll
build a network that tries to reproduce its input data, but with a narrow hidden layer that
serves as a compressed representation of the input data.
As a heads up, the lesson structure will be a bit different from what you've seen before. Here
I'll be walking you through implementing autoencoders in a Jupyter Notebook. You can find
the notebooks for this lesson in our public GitHub repo, in the autoencoder directory. Do a
git pull to get the most recent files!
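To make the idea concrete before we open the notebooks, here is a minimal Keras-style
sketch of an autoencoder (the sizes are illustrative assumptions, and the lesson's notebooks
build theirs in TensorFlow instead):
from keras.models import Sequential
from keras.layers import Dense

# Compress 784-pixel inputs (e.g. flattened MNIST images) down to a
# 32-unit code, then try to reconstruct the original input from it.
autoencoder = Sequential()
autoencoder.add(Dense(32, activation='relu', input_shape=(784,)))  # encoder
autoencoder.add(Dense(784, activation='sigmoid'))                  # decoder
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

# Note the target is the input itself: autoencoder.fit(x, x, epochs=10)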
Transfer Learning
In this lesson you'll be learning about transfer learning. In practice, you won't typically be
training your own huge networks. There are multiple models out there that have been trained
for weeks on huge datasets like ImageNet. In this lesson, you'll be using one of these
pretrained networks, VGGNet, to classify images of flowers.
cd transfer-learning
git clone https://github.com/machrisaa/tensorflow-vgg.git tensorflow_vgg
Additional Packages
To run this code, you'll need a few packages you might not have installed in your environment.
You'll need all the normal ones, plus tqdm and scikit-image . To install them, use Conda as
usual:
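conda install tqdm scikit-image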
08. Classifier Solution
Building The Classifier
09. Training
Training The Classifier
10. Training solution
Training And Testing
01. CNN Project
Project Overview
Welcome to the Convolutional Neural Networks (CNN) project! In this project, you will learn
how to build a pipeline to process real-world, user-supplied images. Given an image of a dog,
your algorithm will identify an estimate of the canine’s breed. If supplied an image of a
human, the code will identify the resembling dog breed.
Along with exploring state-of-the-art CNN models for classification, you will make important
design decisions about the user experience for your app. Our goal is that by completing this
lab, you understand the challenges involved in piecing together a series of models designed to
perform various tasks in a data processing pipeline. Each model has its strengths and
weaknesses, and engineering a real-world application often involves solving many problems
without a perfect answer. Your imperfect solution will nonetheless create a fun user
experience!
Project Instructions
Clone the project from the GitHub repository. Follow the instructions in the README to
complete the project.
Evaluation
Your project will be reviewed by a Udacity reviewer against the CNN project rubric. Review
this rubric thoroughly, and self-evaluate your project before submission. All criteria found in
the rubric must meet specifications for you to pass.
Project Submission
When you are ready to submit your project, collect the following files and compress them into
a single archive for upload:
The dog_app.ipynb file with fully functional code, all code cells executed and displaying
output, and all questions answered.
An HTML or PDF export of the project notebook with the
name report.html or report.pdf .
Any additional images used for the project that were not supplied to you for the
project. Please do not include the project data sets in the dogImages/ or lfw/ folders.
Likewise, please do not include the bottleneck_features/ folder.
Alternatively, your submission could consist of the GitHub link to your repository.
Click on the "Submit Project" button and follow the instructions to submit!
02. Dog Breed Workspace
Workspace
This section contains a workspace (either a Jupyter Notebook workspace or an online code
editor workspace) that cannot be automatically downloaded and reproduced here. Please
access the classroom with your account and manually download the workspace to your local
machine. Note that for some courses, Udacity uploads the workspace files
onto https://github.com/udacity, so you may be able to download them there.
01. Intro
02. Skin Cancer
03. Survival Probability of Skin Cancer
04. Medical Classification
05. The data
06. Image Challenges
07. Quiz: Data Challenges
08. Solution: Data Challenges
09. Training the Neural Network
10. Quiz: Random vs Pre-initialized Weights
11. Solution: Random vs Pre-initialized Weights
12. Validating the Training
13. Quiz: Sensitivity and Specificity
14. Solution: Sensitivity and Specificity
15. More on Sensitivity and Specificity
16. Quiz: Diagnosing Cancer
17. Solution: Diagnosing Cancer
18. Refresh on ROC Curves
19. Quiz: ROC Curve
20. Solution: ROC Curve
21. Comparing our Results with Doctors
22. Visualization
23. What is the network looking at?
24. Refresh on Confusion Matrices
25. Confusion Matrix
26. Conclusion
27. Useful Resources
28. Mini Project Introduction
29. Mini Project: Dermatologist AI
06. Image Challenges
Challenge
Looking at the following images, can you tell which characteristics determine whether a
lesion is benign (above) or malignant (below)?
10. Quiz: Random vs Pre-initialized Weights
Training Network Quiz
Do you think that pretraining the network with completely different images like cats, dogs,
horses, and cars, made cancer classification better, the same, or worse?
Better
The same
Worse
SOLUTION:Better
13. Quiz: Sensitivity and Specificity
For the following quiz, you'll need to Google the definitions of sensitivity and specificity. If
you'd like a refresher on precision and recall, here is one resource.
Specificity is Precision
Sensitivity is Recall
Specificity is Recall and Precision is one minus the Sensitivity
Oh goodness, just tell me.
SOLUTION:Sensitivity is Recall
15. More on Sensitivity and Specificity
Sensitivity and Specificity
Although similar, sensitivity and specificity are not the same as precision and recall. Here are
the definitions:
Recall: Of all the people who have cancer, how many did we diagnose as having cancer?
Precision: Of all the people we diagnosed with cancer, how many actually had cancer?
From here we can see that Sensitivity is Recall, and the other two are not the same thing.
Trust me, we also have a hard time remembering which one is which, so here's a little trick. If
you remember the confusion matrix from Luis's Evaluation Metrics section, sensitivity is
computed over the row of patients who actually have cancer (TP / (TP + FN)), while
specificity is computed over the row of patients who are actually healthy (TN / (TN + FP)).
16. Quiz: Diagnosing Cancer
Diagnosing Quiz
Say we have a neural network that outputs the probability of a melanoma. What value would
you choose as a threshold to classify this as melanoma or no melanoma?
0.2
0.5
0.8
SOLUTION:0.2
17. Solution: Diagnosing Cancer
The graph below is a histogram of the predictions our model gives on a set of lesion images,
as follows:
18. Refresh on ROC Curves
The curves have been introduced as follows: on the horizontal axis we plot the False Positive
Rate, and on the vertical axis we plot the True Positive Rate.
19. Quiz: ROC Curve
24. Refresh on Confusion Matrices
Confusion Matrices
In Luis's Evaluation Metrics section, we learned about confusion matrices, and if you
need a refresher, the video is below.
Confusion Matrix-Question 1
Type 1 and Type 2 Errors
Sometimes in the literature, you'll see False Positives and False Negatives referred to as Type 1
and Type 2 errors. Here is the correspondence:
Type 1 Error (Error of the first kind, or False Positive): In the medical example, this
is when we misdiagnose a healthy patient as sick.
Type 2 Error (Error of the second kind, or False Negative): In the medical example,
this is when we misdiagnose a sick patient as healthy.
The data and objective are pulled from the 2017 ISIC Challenge on Skin Lesion Analysis
Towards Melanoma Detection. As part of the challenge, participants were tasked to design an
algorithm to diagnose skin lesion images as one of three different skin diseases (melanoma,
nevus, or seborrheic keratosis). In this project, you will create a model to generate your own
predictions.
Getting Started
1. Clone the repository and create a data/ folder to hold the dataset of skin images.
6. Place the training, validation, and test images in the data/ folder,
at data/train/ , data/valid/ , and data/test/ , respectively. Each folder should
contain three sub-folders ( melanoma/ , nevus/ , seborrheic_keratosis/ ), each
containing representative images from one of the three image classes.
You are free to use any coding environment of your choice to solve this mini project! In order
to rank your results, you need only use a pipeline that culminates in a CSV file containing your
test predictions.
Create a Model
Use the training and validation data to train a model that can distinguish between the three
different image classes. (After training, you will use the test images to gauge the performance
of your model.)
If you would like to read more about some of the algorithms that were successful in this
competition, please read this article that discusses some of the best approaches. A few of the
corresponding research papers appear below.
While the original challenge provided additional data (such as the gender and age of the
patients), we only provide the image data to you. If you would like to download this additional
patient data, you may do so at the competition website.
All three of the above teams increased the number of images in the training set with additional
data sources. If you'd like to expand your training set, you are encouraged to begin with
the ISIC Archive.
Evaluation
Inspired by the ISIC challenge, your algorithm will be ranked according to three separate
categories.
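Category 1: ROC AUC for Melanoma Classification
In the first category, we will test the ability of your CNN to distinguish melanoma from the
other two diseases by calculating the area under the receiver operating characteristic curve
(ROC AUC) corresponding to this binary classification task.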
If you are unfamiliar with ROC (Receiver Operating Characteristic) curves and would like to
learn more, you can check out the documentation in scikit-learn or read this Wikipedia article.
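For instance, here is a small scikit-learn snippet (with made-up labels and scores, just to
show the computation):
from sklearn.metrics import roc_auc_score

# Hypothetical ground-truth labels and predicted probabilities.
y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]

print(roc_auc_score(y_true, y_scores))  # 0.75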
The top scores (from the ISIC competition) in this category can be found in the image below.
Category 2: ROC AUC for Melanocytic Classification
All of the skin lesions that we will examine are caused by abnormal growth of
either melanocytes or keratinocytes, which are two different types of epidermal skin cells.
Melanomas and nevi are derived from melanocytes, whereas seborrheic keratoses are derived
from keratinocytes.
In the second category, we will test the ability of your CNN to distinguish between
melanocytic and keratinocytic skin lesions by calculating the area under the receiver
operating characteristic curve (ROC AUC) corresponding to this binary classification task.
The top scores in this category (from the ISIC competition) can be found in the image below.
Category 3: Mean ROC AUC
In the third category, we will take the average of the ROC AUC values from the first two
categories.
The top scores in this category (from the ISIC competition) can be found in the image below.