Introduction to Deep Learning with PyTorch
Maham Faisal Khan
Senior Data Scientist
What is deep learning?
Deep learning is everywhere:
Language translation
Self-driving cars
Medical diagnostics
Chatbots
Traditional machine learning relies on hand-crafted feature engineering.
Deep learning enables feature learning from raw data.
Inspired by connections in the human brain
Neurons ➡ neural networks
Models require large amounts of data: typically at least hundreds of thousands of data points
PyTorch: a deep learning framework
PyTorch is:
one of the most popular deep learning frameworks
the framework used in many published deep learning papers
intuitive and user-friendly
It also has much in common with NumPy.
Importing PyTorch and related packages
PyTorch is imported in Python as:

import torch
Tensors: the building blocks of networks in PyTorch
Like NumPy arrays, tensors are multidimensional representations of their elements.
Tensors are the building blocks of neural networks.

Load from a list:

import torch
lst = [[1, 2, 3], [4, 5, 6]]
tensor = torch.tensor(lst)

Load from a NumPy array:

import numpy as np
np_array = np.array(lst)
np_tensor = torch.from_numpy(np_array)
Tensor attributes
Tensor shape:

lst = [[1, 2, 3], [4, 5, 6]]
tensor = torch.tensor(lst)
tensor.shape

torch.Size([2, 3])

Tensor data type:

tensor.dtype

torch.int64

Tensor device:

tensor.device

device(type='cpu')

Deep learning often requires a GPU, which, compared to a CPU, can offer (see the sketch below):
parallel computing capabilities
faster training times
better performance
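As a minimal sketch (not from the original slides; torch.cuda.is_available() and .to() are standard PyTorch calls), a tensor can be moved to a GPU when one is present:

import torch

tensor = torch.tensor([[1, 2, 3], [4, 5, 6]])

# Use a CUDA GPU if one is available; otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tensor = tensor.to(device)
print(tensor.device)  # device(type='cuda', index=0) on a GPU machine, else cpu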
Getting started with tensor operations
Compatible shapes:

a = torch.tensor([[1, 1],
                  [2, 2]])
b = torch.tensor([[2, 2],
                  [3, 3]])

Addition / subtraction:

a + b

tensor([[3, 3],
        [5, 5]])

Incompatible shapes:

a = torch.tensor([[1, 1],
                  [2, 2]])
c = torch.tensor([[2, 2, 4],
                  [3, 3, 5]])

Addition / subtraction:

a + c

RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 1
Getting started with tensor operations
Element-wise multiplication:

a = torch.tensor([[1, 1],
                  [2, 2]])
b = torch.tensor([[2, 2],
                  [3, 3]])
a * b

tensor([[2, 2],
        [6, 6]])

... and much more (examples below):
Transposition
Matrix multiplication
Concatenation

Most NumPy array operations can be performed on PyTorch tensors.
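The slides list these operations without code, so here is a brief sketch using standard PyTorch calls (not from the course):

import torch

a = torch.tensor([[1, 1],
                  [2, 2]])
b = torch.tensor([[2, 2],
                  [3, 3]])

print(a.T)                       # transposition: tensor([[1, 2], [1, 2]])
print(a @ b)                     # matrix multiplication: tensor([[ 5,  5], [10, 10]])
print(torch.cat([a, b], dim=0))  # concatenation along rows: shape (4, 2)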
Let's practice!
Creating our first neural network
Maham Faisal Khan
Senior Data Scientist
Our first neural network
A linear layer takes an input, applies a linear function, and returns an output.

import torch
import torch.nn as nn

# Create input_tensor with three features
input_tensor = torch.tensor(
    [[0.3471, 0.4547, -0.2356]])

# Define our first linear layer
linear_layer = nn.Linear(in_features=3, out_features=2)

# Pass input through the linear layer
output = linear_layer(input_tensor)
print(output)

tensor([[-0.2415, -0.1604]], grad_fn=<AddmmBackward0>)
Getting to know the linear layer operation
Each linear layer has a .weight and a .bias property.

linear_layer.weight

Parameter containing:
tensor([[-0.4799, 0.4996, 0.1123],
        [-0.0365, -0.1855, 0.0432]],
       requires_grad=True)

linear_layer.bias

Parameter containing:
tensor([0.0310, 0.1537], requires_grad=True)
output = linear_layer(input_tensor)

For input X, weights W0, and bias b0, the linear layer performs:

y0 = W0 ⋅ X + b0
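To make this concrete, the layer's output can be reproduced by hand from its .weight and .bias (a small sketch reusing linear_layer and input_tensor from above; nn.Linear computes y = x @ W.T + b):

manual_output = input_tensor @ linear_layer.weight.T + linear_layer.bias
print(torch.allclose(manual_output, linear_layer(input_tensor)))  # True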
Our two-layer network summary
Input dimensions: 1 × 3
Linear layer arguments:
in_features = 3
out_features = 2
Output dimensions: 1 × 2
Networks with only linear layers are called fully connected:
each neuron in one layer is connected to each neuron in the next layer.
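As a quick sanity check (a sketch, not from the slides), the layer's trainable parameter count matches its dimensions:

# 3 inputs * 2 outputs weights + 2 biases = 8 trainable parameters
n_params = sum(p.numel() for p in linear_layer.parameters())
print(n_params)  # 8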
Stacking layers with nn.Sequential()
A neural network with multiple layers
# Create network with three linear layers
model = nn.Sequential(
nn.Linear(10, 18),
nn.Linear(18, 20),
nn.Linear(20, 5)
)
The input is passed through the linear layers in sequence:
input 10 ➡ output 18 ➡ output 20 ➡ output 5
Stacking layers with nn.Sequential()
print(input_tensor)
tensor([[-0.0014, 0.4038, 1.0305, 0.7521, 0.7489, -0.3968, 0.0113, -1.3844, 0.8705, -0.9743]])
# Pass input_tensor to model to obtain output
output_tensor = model(input_tensor)
print(output_tensor)
tensor([[-0.0254, -0.0673, 0.0763,
0.0008, 0.2561]], grad_fn=<AddmmBackward0>)
The output is not yet meaningful: the weights are still randomly initialized.
Let's practice!
Discovering activation functions
Maham Faisal Khan
Senior Data Scientist
Stacked linear operations
So far, we have only seen networks made of linear layers.
Each linear layer multiplies its input by the layer's weights and adds the bias.
Even with multiple stacked linear layers, the output still has a linear relationship with the input, as the sketch below shows.
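A small sketch (not from the course) illustrating this: two stacked linear layers collapse into a single linear map, since y = W2(W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2).

import torch
import torch.nn as nn

# Two stacked linear layers with no activation in between
layer1 = nn.Linear(3, 4)
layer2 = nn.Linear(4, 2)

x = torch.randn(1, 3)
stacked = layer2(layer1(x))

# The composition collapses into a single linear layer:
# W = W2 @ W1 and b = W2 @ b1 + b2
W = layer2.weight @ layer1.weight
b = layer2.weight @ layer1.bias + layer2.bias
collapsed = x @ W.T + b

print(torch.allclose(stacked, collapsed, atol=1e-6))  # True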
Why do we need activation functions?
Activation functions add non-linearity to the network.
With non-linearity, a model can learn more complex relationships.
The "pre-activation" output of a linear layer is passed to the activation function.
Meet the sigmoid function
Mammal or not?
Binary classification task: predict whether an animal is a mammal (1) or not a mammal (0).

Input features:
Limbs: 4
Eggs: 0
Hair: 1

The output of the linear layers (the pre-activation) is 6.
We pass it to the sigmoid, sigmoid(x) = 1 / (1 + e^(-x)), and obtain a value between 0 and 1.
Using the common threshold of 0.5:
If output is > 0.5, class label = 1 (mammal)
If output is <= 0.5, class label = 0 (not mammal)
import torch
import torch.nn as nn

input_tensor = torch.tensor([[6.0]])
sigmoid = nn.Sigmoid()
output = sigmoid(input_tensor)
print(output)

tensor([[0.9975]])
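Applying the 0.5 threshold from the previous slide to this output (a small sketch, not shown in the course):

# Apply the 0.5 threshold to turn the probability into a class label
label = (output > 0.5).int()
print(label)  # tensor([[1]], dtype=torch.int32): predicted class is mammal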
Activation function as the last layer
model = nn.Sequential(
nn.Linear(6, 4), # First linear layer
nn.Linear(4, 1), # Second linear layer
nn.Sigmoid() # Sigmoid activation function
)
Note: a sigmoid as the last step in a network of linear layers is equivalent to traditional logistic regression.
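As a quick usage sketch (the random six-feature input is an assumption for illustration, not from the slides):

import torch

# Hypothetical input: one sample with six features
input_tensor = torch.randn(1, 6)
output = model(input_tensor)
print(output)  # a single probability between 0 and 1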
Getting acquainted with softmax
SIGMOID for BINARY classification problems
SOFTMAX for MULTI-CLASS classification
Example: a problem with N = 3 classes.
Softmax takes an N-element vector as input and outputs a vector of the same size.
It outputs a probability distribution:
each element is a probability (bounded between 0 and 1)
the sum of the output vector is equal to 1
import torch
import torch.nn as nn

# Create an input tensor
input_tensor = torch.tensor([[4.3, 6.1, 2.3]])

# Apply softmax along the last dimension
probabilities = nn.Softmax(dim=-1)
output_tensor = probabilities(input_tensor)
print(output_tensor)

tensor([[0.1392, 0.8420, 0.0188]])

dim = -1 indicates that softmax is applied to the input tensor's last dimension.
nn.Softmax() can be used as the last step in nn.Sequential().
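Checking the probability-distribution properties of this output (a small sketch, not from the slides):

# The output is a valid probability distribution: it sums to 1
print(output_tensor.sum())           # tensor(1.)
print(output_tensor.argmax(dim=-1))  # tensor([1]): the second class is most likely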
Let's practice!