
Introduction to Deep Learning


2024

Ando Ki, Ph.D.


[email protected]

Table of contents
 Modeling a neuron
 Perceptron
 How perceptron classifies hyperplane
 Perceptron: Boolean
 Perceptron: Boolean AND training
 Multi-layered perceptron
 Layer-wise organization
 Categories of ANN
 Brief history of neural network
 Popular frameworks
 Artificial neuron: Perceptron
 Artificial neuron: activation functions
 Artificial neural network: ANN
 Fully connected feed-forward network: FC-FFN
 Optional output layer: Softmax
 How to find a good or the best network: Loss/Cost
 How to find a good or the best network: Total Loss
 How to minimize total loss by changing [W] and [b]
 Optimization algorithm: gradient descent
 How to compute gradient
 Neural network
 Popular types of neural network
 Deep neural net
 NN categories by applications


Modeling a neuron
 Neuron (nerve cell)
► Dendrite
 input
► Axon
 output
 Branches of axon
 Terminals of axon (axon tip)
⚫ synaptic knob
► Synapse
 junction between two nerve cells

 Human
► whole brain
 ~86 billion neurons (Giga, 10^9)
 ~100 trillion synapses (Tera, 10^12)
► cerebral cortex
 19~23 billion neurons

https://www.quora.com/What-is-deep-learning

Modeling a neuron
https://en.wikipedia.org/wiki/Activation_function
 Activation functions

(Figure: artificial neuron model with a sigmoid activation function.)


Perceptron: single layer neural network


 Perceptron is a single artificial neuron that computes a weighted sum of its inputs (plus a bias b) and applies a threshold activation function.
► It is also called a TLU (threshold logic unit).
► It effectively separates the input space into two categories by the hyperplane W*X + b = 0.
► Perceptron is a linear classifier.
 Cannot deal with non-linear cases
► Perceptron refers to a particular supervised learning model trained with a simple error-driven weight-update rule (the perceptron learning rule).
► Perceptron is an algorithm for supervised learning of binary classifiers.
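A minimal sketch of this computation in Python with NumPy (the step activation and the particular weights and bias below are illustrative assumptions, not values from the slides):

import numpy as np

def perceptron(x, w, b):
    # Single perceptron: weighted input W*X + b followed by a threshold (step) activation
    z = np.dot(w, x) + b
    return 1 if z > 0 else 0

w = np.array([1.0, 1.0])   # illustrative weights
b = -1.5                   # illustrative bias
print(perceptron(np.array([1, 1]), w, b))   # 1 -> this point lies on one side of the hyperplane
print(perceptron(np.array([1, 0]), w, b))   # 0 -> this point lies on the other side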


How perceptron classifies hyperplane


► Two inputs (X1, X2): the decision boundary W1*x1 + W2*x2 + W0 = 0 is a line (y = ax + b) that splits the plane into the two output regions Y=a and Y=b.
► Three inputs (X1, X2, X3): the decision boundary W1*x1 + W2*x2 + W3*x3 + W0 = 0 is a plane (z = ax + by + c).
► Classes that cannot be separated by such a hyperplane need a multi-layer perceptron.

Perceptron: Boolean

► AND: W1=1, W2=1, threshold t=1.5
► OR: W1=1, W2=1, threshold t=0.5
► NOT: W1=-1, threshold t=-0.5
► XOR: not realizable by a single perceptron; the points (0,1) and (1,0) with Y=1 and the points (0,0) and (1,1) with Y=0 cannot be separated by one line.
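A quick check of these gates in Python, folding each threshold t into a bias b = -t (a convention assumed here, not stated on the slide):

import numpy as np

def perceptron(x, w, b):
    return 1 if np.dot(w, x) + b > 0 else 0

print("AND", [perceptron(np.array(x), np.array([1.0, 1.0]), -1.5) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 0, 0, 1]
print("OR",  [perceptron(np.array(x), np.array([1.0, 1.0]), -0.5) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 1]
print("NOT", [perceptron(np.array([x]), np.array([-1.0]), 0.5) for x in [0, 1]])                               # [1, 0]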


Perceptron: Boolean AND training


 Step 1: initialize the weights and the threshold.
► Weights may be initialized to 0 or to a small random value.
 Step 2: repeat until the error is less than a specific value.
► Calculate the output for the j-th training example: y = f(sum_i wi*xi), where f applies the threshold.
► Calculate the error: e = d - y (d is the desired or expected value).
► Update the weights (for the i-th input of the j-th example): wi = wi + e*xi (learning rate 1).

 Training set [{inputs: expected}]
► T0={0,0:0}, T1={0,1:0}, T2={1,0:0}, T3={1,1:1}
 for T0, T1 and T2 (assume all weights are 0)
► y = 0x0+0x0 = 0
► e = 0-0 = 0 (no error)
► No update since no error
 for T3
► y = 1x0+1x0 = 0
► e = 1-0 = 1
► w0 = 0 + (1-0) = 1
► w1 = 0 + (1-0) = 1
 After updating
► for T3, T2, T1 and T0
 y = 1x1+1x1 = 2 => above threshold 1.5 => output 1
⚫ e = 1-1 = 0
 y = 1x1+1x0 = 1 => below threshold 1.5 => output 0
⚫ e = 0-0 = 0
 y = 1x0+1x1 = 1 => below threshold 1.5 => output 0
⚫ e = 0-0 = 0
 y = 1x0+1x0 = 0 => below threshold 1.5 => output 0
⚫ e = 0-0 = 0
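A minimal sketch of this training loop in Python, assuming a learning rate of 1 and keeping the threshold fixed at 1.5 as on the slide:

import numpy as np

def train_perceptron(training_set, threshold, epochs=10, lr=1.0):
    # Perceptron learning rule: w_i <- w_i + lr * (d - y) * x_i
    w = np.zeros(2)                                    # Step 1: initialize weights to 0
    for _ in range(epochs):                            # Step 2: repeat
        total_error = 0
        for x, d in training_set:
            x = np.array(x, dtype=float)
            y = 1 if np.dot(w, x) >= threshold else 0  # calculate output
            e = d - y                                  # calculate error
            w += lr * e * x                            # update weights
            total_error += abs(e)
        if total_error == 0:                           # stop once every example is correct
            break
    return w

and_set = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
print(train_perceptron(and_set, threshold=1.5))        # [1. 1.]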


Perceptron: Boolean OR training


 Training set [{inputs: expected}]
► T0={0,0:0}, T1={0,1:1}, T2={1,0:1}, T3={1,1:1}
 for T0 (assume all weights are 0)
► y = 0x0+0x0 = 0
► e = 0-0 = 0 (no error)
► No update since no error
 for T1
► y = 0x0+0x1 = 0
► e = 1-0 = 1
► w0 = 0 + (1-0) = 1
► w1 = 0 + (1-0) = 1
► Update w0 and w1
 After updating
► for T2
 y = 1x1+1x0 = 1 => apply threshold => output 1
 e = 1-1 = 0
► No update since no error
► for T3
 y = 1x1+1x1 = 2 => apply threshold => output 1
 e = 1-1 = 0
► No update since no error
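The same train_perceptron sketch from the AND slide can be reused here; only the training set and the threshold (0.5, as on the Boolean slide) change:

or_set = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
print(train_perceptron(or_set, threshold=0.5))   # [1. 1.]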


MLP: Multi-layered perceptron


Multi-layered perceptron
 Two-unit network (two layers)

(Figure: a two-layer network in which inputs X1 and X2 feed hidden units H3 and H4, which in turn feed output unit O6, realizing XOR; in the XOR plot, Y=1 at (0,1) and (1,0) and Y=0 at (0,0) and (1,1).)

(from Pascal Vincent’s slides)
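A minimal sketch of one way a two-layer network can realize XOR, using hand-picked threshold units (these particular weights and thresholds are an illustrative assumption, not the ones in the figure): XOR(x1, x2) = AND(OR(x1, x2), NAND(x1, x2)).

import numpy as np

def unit(x, w, t):
    # Threshold unit: fires (1) when the weighted sum reaches threshold t
    return 1 if np.dot(w, x) >= t else 0

def xor_mlp(x1, x2):
    h3 = unit([x1, x2], [1, 1], 0.5)      # hidden unit H3: OR
    h4 = unit([x1, x2], [-1, -1], -1.5)   # hidden unit H4: NAND
    return unit([h3, h4], [1, 1], 1.5)    # output unit O6: AND of the two

print([xor_mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])   # [0, 1, 1, 0]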



Layer-wise organization
 3 types of layers
► input layer: not counted in the number of layers
► hidden layer
► output layer

 For the picture on the left
► assume fully connected
► 4-layered, including 3 hidden layers
► 16 neurons: 5+4+5+2
► 65 weights: 3x5+5x4+4x5+5x2 (not including bias)
► 16 biases: 5+4+5+2
► 81 learnable parameters: 65+16 (see the counting sketch below)

 Modern neural network
► 10~20 layers, ~100 million parameters
► How about 125 layers?

(Figure: a fully-connected multi-layered neural network with an input layer of 3 input features plus a bias node, hidden layers of 5, 4 and 5 neurons, and an output layer of 2 output neurons (classes).)
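A small sketch of this parameter counting in Python; the layer sizes [3, 5, 4, 5, 2] are read off the weight and neuron counts above:

def count_parameters(layer_sizes):
    # layer_sizes = [inputs, hidden..., outputs] of a fully connected feed-forward net
    weights = sum(a * b for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))
    biases = sum(layer_sizes[1:])               # one bias per non-input neuron
    return weights, biases, weights + biases

print(count_parameters([3, 5, 4, 5, 2]))        # (65, 16, 81)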



Categories of ANN (Artificial Neural network)


 Fully-Connected NN
► feed forward
► Multi-Layer Perceptron (MLP)
 Convolutional NN (CNN)
► feed forward, sparsely-connected
► image recognition
► AlphaGo
 Recurrent NN (RNN)
► feedback
 Long Short-Term Memory (LSTM)
► feedback + storage
► Microsoft speech recognition
► Google neural machine translation (GNMT)

(Diagram: a taxonomy of ANNs. FNN (Feed-Forward Neural Network): single-layer perceptron, MLP (multi-layer perceptron), CNN (Convolutional Neural Network). RNN (Recurrent Neural Network): fully recurrent network, Hopfield network, simple recurrent network, Boltzmann machine, LSTM (Long Short-Term Memory network), and others.)

See neural network topology: http://www.asimovinstitute.org/neural-network-zoo/


Popular Frameworks
 Popular Frameworks with supported
interfaces
► Caffe
 Berkeley / BVLC (Berkeley Vision and Learning Center)
 C, C++, Python, Matlab
► TensorFlow
 Google Brain
 C++, Python
► PyTorch
► theano
 U. Montreal
 Python
► torch
 Facebook / NYU
 C, C++, Lua
► CNTK https://blogs.nvidia.com/blog/2016/01/12/accelerating-ai-artificial-intelligence-gpus/
 Microsoft
► MXNet
 Carnegie Mellon University / DMLC (Distributed
Machine Learning Community)
https://developer.nvidia.com/deep-learning-frameworks

Popularity

Deep Learning Framework Power Scores (by Jeff Hale) http://bit.ly/2GBa3tU

https://towardsdatascience.com/deep-learning-framework-power-scores-2018-23607ddf297a

Table of contents
 Artificial neuron: Perceptron
 Artificial neuron: activation functions
 Artificial neural network: ANN
 Fully connected feed-forward network: FC-FFN
 Optional output layer: Softmax
 How to find a good or the best network: Loss/Cost
 How to find a good or the best network: Total Loss
 How to minimize total loss by changing [W] and [b]
 Optimization algorithm: gradient descent
 How to compute gradient
 Neural network
 Popular types of neural network
 Deep neural net
 NN categories by applications
 Popular DNNs and Frameworks


Artificial neuron: Perceptron


 Artificial Neuron: Perceptron
► inputs: a1, a2, ..., aK
► weights: W1, W2, ..., WK
► bias: b
► weighted sum: z = W1*a1 + W2*a2 + ... + WK*aK + b
► activation function applied to z gives the output


Artificial neuron: activation functions


► Logistic ("soft step"), i.e. the sigmoid: y = 1/(1 + e^(-x))
► ReLU (rectified linear unit): y = max(x, 0)
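A minimal sketch of these two activation functions in Python with NumPy:

import numpy as np

def sigmoid(x):
    # Logistic ("soft step"): squashes any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Rectified linear unit: y = max(x, 0)
    return np.maximum(x, 0)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))   # [0.119... 0.5 0.880...]
print(relu(x))      # [0. 0. 2.]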


Artificial Neural Network: ANN


 Artificial Neural Network: ANN
► The network structure is defined by how the neurons are connected.
► Each neuron can have different values of weights and bias.
► Weights and biases are the network parameters.


Artificial Neural Network: ANN


► N: number of inputs; A: number of neurons in the layer
► Neuron a (a = 1..A) computes y_a = w_{a,1}*x_1 + w_{a,2}*x_2 + ... + w_{a,N}*x_N + b_a
► In matrix form: y = W^T * x + b, where x is an N-vector, W is the weight matrix (one column of N weights per neuron), and b and y are A-vectors


Fully connected feed-forward network: FC-FFN


 Activation function: E.g., Sigmoid – S-shaped function
 Worked example: inputs x1=1, x2=-1 pass through a fully connected network with three 2-neuron layers (1), (2), (3), whose biases are (1, 0), (0, 0) and (-2, 2):

► Layer (1): [1, -1] x [[1, -1], [-2, 1]] + [1, 0] = [4, -2] -> sigmoid -> [0.98, 0.12]
► Layer (2): [0.98, 0.12] x [[2, -2], [-1, -1]] + [0, 0] = [1.84, -2.08] -> sigmoid -> [0.86, 0.11]
► Layer (3): [0.86, 0.11] x [[3, -1], [-1, 4]] + [-2, 2] = [??, ??] -> sigmoid -> [??, ??]

 As a whole the network computes a function f:
► f([1, -1]) = [0.62, 0.83]
► f([0, 0]) = [0.51, 0.85]
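A minimal sketch of this forward pass in Python with NumPy, using the weights and biases above; it also fills in the values marked '??' for layer (3) (about [0.47, 1.58] before the sigmoid and [0.62, 0.83] after it):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

layers = [                                               # one (weight matrix, bias) pair per layer
    (np.array([[1., -1.], [-2., 1.]]), np.array([1., 0.])),
    (np.array([[2., -2.], [-1., -1.]]), np.array([0., 0.])),
    (np.array([[3., -1.], [-1., 4.]]), np.array([-2., 2.])),
]

def forward(x):
    a = np.array(x, dtype=float)
    for W, b in layers:
        a = sigmoid(a @ W + b)      # row-vector convention: next activation = sigmoid(a W + b)
        print(a.round(2))
    return a

forward([1, -1])   # prints [0.98 0.12], [0.86 0.11], [0.62 0.83]
forward([0, 0])    # ends at [0.51 0.85]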


Do it yourself
 Calculate the output

(Same network as on the previous slide, now with inputs x1=0 and x2=0.)


Optional output layer: Softmax


 In general, the outputs of an artificial neural network can be any values, from very small to very large, including negative.
► e.g., f([1, -1]) = [0.62, 0.83] and f([0, 0]) = [0.51, 0.85] from the previous example: the raw outputs do not sum to 1 and are hard to interpret as probabilities.

 Softmax for output layer
► Softmax is a function that transforms a set of values into values between 0 and 1 that sum to 1.
 Scores (-inf, inf) ==> probabilities [0, 1]
► Also called multinomial logistic or normalized exponential function
► y_j = exp(z_j - m) / sum_k exp(z_k - m), where 'm' is max{z1, ..., zK}; subtracting m keeps the exponentials from overflowing.
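A minimal sketch of the numerically stable softmax in Python with NumPy, subtracting m = max(z) before exponentiating:

import numpy as np

def softmax(z):
    # Normalized exponential: maps arbitrary scores to probabilities that sum to 1
    e = np.exp(z - np.max(z))       # subtract the maximum for numerical stability
    return e / e.sum()

z = np.array([3.0, 1.0, -3.0])
print(softmax(z).round(3))          # [0.879 0.119 0.002]
print(softmax(z).sum())             # ~1.0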


Optional output layer: Softmax


 Softmax converts scores to probabilities: scores (-inf, inf) ==> probabilities [0, 1]
► un-normalized probabilities (summation will not give 1): exp(z_j) for result j
► normalized probabilities (summation will give 1): exp(z_j) / sum_k exp(z_k)

(Figure: the network from the FC-FFN example with a softmax stage appended; each output z_j is exponentiated and then normalized.)


Probability and odds and logits


 Let's take an example of binary classification
► Classes: C1, C2
► Probability of C1 for given x: y = P(C1|x)
► Probability of C2 for given x: 1 - y = P(C2|x)
► Define 'odds' = y/(1-y) = P(C1|x)/(1-P(C1|x))
► Define 'logit' = ln(odds) ➔ the inverse of the 'sigmoid'.
 The logit maps a value in [0, 1] to a value in (-inf, +inf).
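A small Python sketch showing that the logit and the sigmoid undo each other:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    # ln(odds): maps a probability in (0, 1) to a value in (-inf, +inf)
    return np.log(p / (1.0 - p))

p = 0.9
z = logit(p)
print(z)            # ~2.197, i.e. ln(0.9/0.1)
print(sigmoid(z))   # ~0.9, the sigmoid inverts the logit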


Optional output layer: one-hot encoding and argmax


 One-hot encoding
► Encodes class labels as vectors in which exactly one element is 1 and all others are 0.
► Select one only among many.
 Argmax is an operation that finds the argument (index) that gives the maximum value of a target function.

(Figure: softmax outputs, e.g. index 0: 0.02, index 1: 0.9, ..., index 9: 0.01, are passed to argmax, which returns 1, the index of the largest value.)
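A minimal sketch in Python with NumPy of argmax over softmax outputs and the corresponding one-hot vector (the probability values are illustrative):

import numpy as np

probs = np.array([0.02, 0.90, 0.03, 0.01, 0.01, 0.01, 0.005, 0.005, 0.005, 0.005])
cls = int(np.argmax(probs))      # index of the largest probability
one_hot = np.zeros_like(probs)
one_hot[cls] = 1.0               # one-hot encoding of the predicted class
print(cls)                       # 1
print(one_hot)                   # [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]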


How to find a good or the best network: Loss/Cost


 The loss function is the distance between the network output and the target.
► Also called cost function or error function.
► It indicates how good the result is.
► There can be different loss functions.
 The simplest one is a summation of |t - y|.
⚫ A perfect match gives 0.

(Figure: a 16x16-pixel image gives 256 input values x1..x256; the network produces outputs y1..y10, one per class (dog, cat, ..., truck); the targets t1..t10 mark the correct class, e.g. t2=1 for "cat" and 0 elsewhere; loss = sum of distances between outputs and targets.)

• Training error: error on the training data set.
• Generalization error (test error): error on a test data set, used to evaluate the trained model.

How to find a good or the best network: Total Loss


 Total loss (L) is the sum of the per-example losses (l_r): L = l_1 + l_2 + ... + l_R over all R training examples.
► Make it as small as possible.
 Training means finding the network parameters that minimize the total loss L.
► This means we should modify the network parameters according to the total loss.

(Figure: each example r produces an output y_r, which is compared with its target t_r to give a loss l_r; the total loss sums l_1 ... l_R over all training data.)

Cost functions (error function)


• y: inference (calculated) value, t: target value

 Absolute error
► Sum of absolute errors
 sum(|t - y|)
► Mean absolute error (MAE)
 sum(|t - y|)/n

 Squared error loss
► Sum of squared errors
 sum((t - y)**2)
► Mean squared error (MSE)
 sum((t - y)**2)/n
► Root mean squared error (RMSE)
 (MSE)**(1/2)

 Cross-entropy loss
► For classification after Softmax
► Sum of cross-entropy loss
 -sum(t*log(y))
⚫ with one-hot targets, only the term where t=1 contributes
 or -sum[t*log(y) + (1-t)*log(1-y)]
⚫ the second term adds cost when t is 0 but y is large
► -log(y) becomes large when the softmax output y for the correct class is small; y=1 gives zero loss (correct), y<1 gives positive loss (error).

(Figure: plots of the error |t - y| against (t - y), and of -log(y) against y.)
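A minimal sketch of these cost functions in Python with NumPy; the small eps added inside the log (to avoid log(0)) is an implementation detail, not something from the slide:

import numpy as np

def mae(t, y):
    return np.mean(np.abs(t - y))            # mean absolute error

def mse(t, y):
    return np.mean((t - y) ** 2)             # mean squared error

def rmse(t, y):
    return np.sqrt(mse(t, y))                # root mean squared error

def cross_entropy(t, y, eps=1e-12):
    # -sum(t*log(y)) for one-hot targets t and softmax outputs y
    return -np.sum(t * np.log(y + eps))

t = np.array([0.0, 1.0, 0.0])                # one-hot target: class 1
y = np.array([0.1, 0.8, 0.1])                # softmax output
print(mae(t, y), mse(t, y), rmse(t, y))      # 0.133..., 0.02, 0.141...
print(cross_entropy(t, y))                   # ~0.223 = -log(0.8)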

Log plots
import numpy as np
from matplotlib import pyplot as plt

# Sample points; log() is undefined outside its domain, so NumPy returns NaN
# (with a RuntimeWarning) for those x values and matplotlib simply skips them.
y = np.linspace(-1.5, 1.5, 400)

plt.plot(y, np.log(y), color='blue')            # log(y): defined for y > 0
plt.text(0.3, -2, 'log(y)', fontsize=15, color='blue')

plt.plot(y, -np.log(y), color='black')          # -log(y): cross-entropy term for t = 1
plt.text(0.2, 2, '-log(y)', fontsize=15, color='black')

plt.plot(y, -np.log(-y), color='red')           # -log(-y): defined for y < 0
plt.text(-0.7, 2, "-log(-y)", fontsize=15, color='red')

plt.plot(y, -np.log(1 - y), color='green')      # -log(1-y): cross-entropy term for t = 0
plt.text(1.0, 2, "-log(1-y)", fontsize=15, color='green')

plt.grid()
plt.show()


How to minimize total loss by changing [W] and [b]


 If we can find how the network parameters affect the total loss, it may be possible to figure out how to minimize the total loss.
 However, the number of parameters is far too large to do this by hand.
► AlexNet: 650K neurons, 8 layers, 60 million parameters
 So we apply a gradual, step-by-step iterative method called 'gradient descent'. It is an optimization algorithm.
► Update rule: W(t+1) = W(t) - eta * dL/dW, where eta is the learning rate.
► Negative slope (dL/dW < 0) ➔ increase W (by an amount scaled by the learning rate)
► Positive slope (dL/dW > 0) ➔ decrease W
► Steep slope ➔ large change of W at the next step
► go on until the slope is small enough, i.e., near a minimum (a stationary point)

(Figure: two plots of total loss L versus W, one where the slope at W(t=0) is negative and W moves right, one where it is positive and W moves left.)
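A minimal one-parameter sketch of gradient descent in Python; the loss L(W) = (W - 3)^2 is just an illustrative stand-in for a real total loss:

def loss(W):
    return (W - 3.0) ** 2            # toy total loss with its minimum at W = 3

def grad(W):
    return 2.0 * (W - 3.0)           # dL/dW

W = 0.0                              # W at t=0
eta = 0.1                            # learning rate
for t in range(50):
    W = W - eta * grad(W)            # W(t+1) = W(t) - eta * dL/dW
print(W, loss(W))                    # W approaches 3 and the loss approaches 0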


Optimization algorithm: gradient descent


 Initial value problem
► a different initial point can lead to a different minimum
 Local minimum problem (getting stuck in local minima)
► convergence to the global minimum is never guaranteed
 Learning rate problem
► a large learning rate could cause oscillation
► a small learning rate results in slow learning
 Vanishing gradient problem
► If a change in a parameter's value causes only a very small change in the network's output, the network cannot learn that parameter effectively.
 Exploding gradient problem

(Figure: a loss surface L over parameters W1 and W2 with several local minima.)


Popular types of Neural Network (NN)


 DNN: Deep NN
► More general model
► fully connected
► feed-forward (i.e., MLP: multilayer perceptron)
► speech, image processing, natural language processing (NLP)
 CNN: Convolutional NN
► commonly used for (optimized for) images
► connected locally (i.e., sparsely-connected)
► feed-forward
► object/facial recognition
 RNN: Recurrent NN
► context driven, time-series optimization
► variable connectivity
► feed-back in addition to feed-forward
► NLP and speech recognition
► Long Short-Term Memory (LSTM)
 feed-back + storage


Deep neural net


 Any continuous function can be realized by a network with one hidden layer with sufficiently many neurons (universality theorem, universal approximation theorem).
► A one-hidden-layer network can represent any continuous function.
► Such a network is a shallow, fat (thick) neural net.

 A deep, thin neural net (deep NN) is better than a shallow, fat net.
► Using multiple layers of neurons to represent some functions is much simpler.
 Fewer parameters ➔ less computation


Neural network in brief
