Understanding Activation Functions
A Step-by-Step Guide with Examples
Sridhar
August 6, 2025
Introduction
Activation functions are mathematical operations applied to a neuron's weighted input to determine its output, and they shape the behaviour of the network as a whole. This guide explains five important activation functions with detailed, step-by-step examples.
1 Linear Activation Function
1.1 Formula
f(z) = z
1.2 Properties
• Output equals input
• Range: (−∞, ∞)
• Used in regression problems
1.3 Example Problems
1.3.1 Problem 1
Given input vector z = [2.0, −1.5, 3.0], find the output.
Solution:
Since f(z) = z, the output equals the input: a = z = [2.0, −1.5, 3.0]
1.3.2 Problem 2
Calculate the output for:
Weights w = [0.5, −0.3, 1.2]
Bias b = −0.1
Input x = [1.0, 2.0, −1.0]
Solution:
z = (0.5 × 1.0) + (−0.3 × 2.0) + (1.2 × −1.0) + (−0.1)
= 0.5 − 0.6 − 1.2 − 0.1
= −1.4
a = z = −1.4
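The calculation from Problem 2 can be checked with a short NumPy sketch. The variable names and the linear helper function are illustrative choices; the weights, bias, and input values are taken from the problem statement.
import numpy as np

# Linear activation: the output equals the pre-activation value
def linear(z):
    return z

w = np.array([0.5, -0.3, 1.2])   # weights from Problem 2
x = np.array([1.0, 2.0, -1.0])   # input from Problem 2
b = -0.1                         # bias

z = np.dot(w, x) + b             # weighted sum plus bias
a = linear(z)                    # activation leaves z unchanged
print(a)                         # approximately -1.4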
2 Sigmoid (Logistic) Activation Function
2.1 Formula
σ(z) = 1 / (1 + e^(−z))
2.2 Properties
• Output between 0 and 1
• Used for binary classification
• "S"-shaped curve
2.3 Example Problems
2.3.1 Problem 1
Calculate σ(1.0).
Solution:
e^(−1.0) ≈ 0.3679
1 + 0.3679 = 1.3679
σ(1.0) = 1 / 1.3679 ≈ 0.731
2.3.2 Problem 2
Find outputs for z = [0.5, −1.0, 2.0].
Solution:
σ(0.5) = 1 / (1 + e^(−0.5)) ≈ 1 / 1.6065 ≈ 0.622
σ(−1.0) = 1 / (1 + e^(1.0)) ≈ 1 / 3.7183 ≈ 0.269
σ(2.0) = 1 / (1 + e^(−2.0)) ≈ 1 / 1.1353 ≈ 0.881
a ≈ [0.622, 0.269, 0.881]
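Both problems can be reproduced with a minimal NumPy sketch. The sigmoid helper below is an illustrative definition of the formula above, not library code.
import numpy as np

# Sigmoid activation: squashes any real number into (0, 1)
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(1.0))                          # approximately 0.731 (Problem 1)
print(sigmoid(np.array([0.5, -1.0, 2.0])))   # approximately [0.622 0.269 0.881] (Problem 2)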
3 Tanh (Hyperbolic Tangent) Activation Function
3.1 Formula
tanh(z) = (e^z − e^(−z)) / (e^z + e^(−z))
3.2 Properties
• Output between -1 and 1
• Zero-centered
• Stronger gradient than sigmoid
3.3 Example Problems
3.3.1 Problem 1
Calculate tanh(0.8).
Solution:
e^(0.8) ≈ 2.2255
e^(−0.8) ≈ 0.4493
Numerator = 2.2255 − 0.4493 = 1.7762
Denominator = 2.2255 + 0.4493 = 2.6748
tanh(0.8) = 1.7762 / 2.6748 ≈ 0.664
3.3.2 Problem 2
Find outputs for z = [−0.5, 1.5, 0.0].
Solution:
tanh(−0.5) ≈ −0.462
tanh(1.5) ≈ 0.905
tanh(0.0) = 0
a ≈ [−0.462, 0.905, 0.0]
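The values above can be reproduced with the sketch below. The tanh helper mirrors the formula directly, and NumPy's built-in np.tanh is printed alongside it for comparison; the helper name is an illustrative choice.
import numpy as np

# Tanh activation: zero-centered output in (-1, 1)
def tanh(z):
    return (np.exp(z) - np.exp(-z)) / (np.exp(z) + np.exp(-z))

print(tanh(0.8))                          # approximately 0.664 (Problem 1)
print(tanh(np.array([-0.5, 1.5, 0.0])))   # approximately [-0.462  0.905  0.0] (Problem 2)
print(np.tanh(0.8))                       # same result via NumPy's built-in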
4 ReLU (Rectified Linear Unit) Activation Function
4.1 Formula
ReLU(z) = max(0, z)
4.2 Properties
• Output is 0 for negative inputs
• Linear for positive inputs
• Computationally efficient
4.3 Example Problems
4.3.1 Problem 1
Calculate ReLU(-2.3) and ReLU(1.7).
Solution:
ReLU(−2.3) = max(0, −2.3) = 0
ReLU(1.7) = max(0, 1.7) = 1.7
4.3.2 Problem 2
Find outputs for z = [−1.0, 0.5, 3.0, −0.2].
Solution:
a = [0, 0.5, 3.0, 0]
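A short sketch reproduces both problems using np.maximum, which applies max(0, z) element-wise; the relu helper name is illustrative.
import numpy as np

# ReLU activation: zero for negative inputs, identity for positive inputs
def relu(z):
    return np.maximum(0, z)

print(relu(-2.3), relu(1.7))                    # 0.0 1.7 (Problem 1)
print(relu(np.array([-1.0, 0.5, 3.0, -0.2])))   # [0.  0.5 3.  0. ] (Problem 2)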
5 Softmax Activation Function
5.1 Formula
Softmax(z_i) = e^(z_i) / Σ_{j=1}^{n} e^(z_j)
5.2 Properties
• Outputs sum to 1
• Used for multi-class classification
• Amplifies differences between values
5.3 Example Problems
5.3.1 Problem 1
Calculate softmax for z = [1.0, 2.0, 3.0].
Solution:
e^(1.0) ≈ 2.718
e^(2.0) ≈ 7.389
e^(3.0) ≈ 20.085
Sum = 2.718 + 7.389 + 20.085 = 30.192
Softmax(1.0) = 2.718 / 30.192 ≈ 0.090
Softmax(2.0) = 7.389 / 30.192 ≈ 0.245
Softmax(3.0) = 20.085 / 30.192 ≈ 0.665
a ≈ [0.090, 0.245, 0.665]
5.3.2 Problem 2
Calculate softmax for z = [0.5, −0.5, 1.0].
Solution:
e^(0.5) ≈ 1.648
e^(−0.5) ≈ 0.606
e^(1.0) ≈ 2.718
Sum = 1.648 + 0.606 + 2.718 = 4.972
Softmax(0.5) = 1.648 / 4.972 ≈ 0.331
Softmax(−0.5) = 0.606 / 4.972 ≈ 0.122
Softmax(1.0) = 2.718 / 4.972 ≈ 0.547
a ≈ [0.331, 0.122, 0.547]
Most probable class: 3 (0.547)
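Both softmax problems can be reproduced with the sketch below. The max-subtraction step inside the softmax helper is a common numerical-stability trick that goes beyond the formula above; it does not change the resulting probabilities. The helper name and the final argmax line are illustrative additions.
import numpy as np

# Softmax activation: exponentiates and normalizes so the outputs sum to 1.
# Subtracting max(z) before exponentiating avoids overflow for large inputs.
def softmax(z):
    exp_z = np.exp(z - np.max(z))
    return exp_z / np.sum(exp_z)

print(softmax(np.array([1.0, 2.0, 3.0])))    # approximately [0.090 0.245 0.665] (Problem 1)
probs = softmax(np.array([0.5, -0.5, 1.0]))  # Problem 2
print(probs)                                 # approximately [0.331 0.122 0.547]
print(np.argmax(probs) + 1)                  # most probable class: 3 (counting from 1)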
Summary Table
Function | Formula                         | Range   | Example
Linear   | f(z) = z                        | (−∞, ∞) | 2.0 → 2.0
Sigmoid  | 1 / (1 + e^(−z))                | (0, 1)  | 1.0 → 0.731
Tanh     | (e^z − e^(−z)) / (e^z + e^(−z)) | (−1, 1) | 0.8 → 0.664
ReLU     | max(0, z)                       | [0, ∞)  | −2.3 → 0
Softmax  | e^(z_i) / Σ_j e^(z_j)           | (0, 1)  | [1, 2, 3] → [0.090, 0.245, 0.665]
Conclusion
This guide has explained five important activation functions with step-by-step examples. Remember:
• Linear: No transformation
• Sigmoid: For probabilities (0 to 1)
• Tanh: Similar to sigmoid, but outputs range from −1 to 1
• ReLU: Simple and effective
• Softmax: For multi-class probabilities