Specialized Convolutional Layers: Motivation
• Computational Cost of Standard Convolutions: Standard convolutions can be computationally
demanding, especially with many channels, large kernels, or high-resolution inputs.
• Need for Efficiency: Resource-constrained applications (e.g., mobile, embedded systems) require
more efficient alternatives.
• Benefits of Specialized Convolutions: These offer:
• Reduced computational cost (fewer parameters and FLOPs).
• Improved efficiency (faster inference/training).
• Potential performance gains.
• Types of Specialized Convolutions (covered next):
• Depthwise Convolution
• Grouped Convolution
• Pointwise Convolution (1x1)
• Depthwise Separable Convolution
Standard Convolution: Recap
• Input: H × W × Cin
• Kernel: K × K × Cin
• Number of filters: Cout
• Output: H ′ × W ′ × Cout
• Number of parameters: K × K × Cin × Cout
• FLOPs: H ′ × W ′ × K × K × Cin × Cout
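As a quick check on these formulas, here is a small Python helper (the function and its argument names are ours, not from any library) that counts one multiply-accumulate per weight per output position:

def standard_conv_cost(h_out, w_out, k, c_in, c_out):
    params = k * k * c_in * c_out   # K x K x Cin x Cout
    flops = h_out * w_out * params  # H' x W' x K x K x Cin x Cout
    return params, flops

print(standard_conv_cost(112, 112, 3, 64, 128))  # (73728, 924844032)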
Depthwise Convolution
• Applies a single filter to each input channel independently.
• Input: H × W × Cin
• Kernel: K × K × 1 (one filter per channel)
• Output: H ′ × W ′ × Cin (same number of channels as input)
• Number of parameters: K × K × Cin
• FLOPs: H ′ × W ′ × K × K × Cin
• Much more efficient than standard convolution: roughly Cout times fewer parameters and FLOPs, so the savings grow with the number of output channels.
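A minimal Keras sketch, assuming a 32x32x16 input; note the output keeps Cin channels and the parameter count matches K × K × Cin (plus one bias per channel):

import tensorflow as tf

x = tf.random.normal((1, 32, 32, 16))  # batch of one H x W x Cin input
dw = tf.keras.layers.DepthwiseConv2D(kernel_size=3, padding='same')
print(dw(x).shape)        # (1, 32, 32, 16): one filter per input channel
print(dw.count_params())  # 3*3*16 weights + 16 biases = 160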
Pointwise Convolution (1x1 Convolution)
• Input: H × W × Cin
• Kernel: 1 × 1 × Cin (each filter computes a linear combination of the input channels).
• Number of filters: Cout
• Output: H × W × Cout (spatial dimensions remain the same)
• Number of parameters: 1 × 1 × Cin × Cout = Cin × Cout
• FLOPs: H × W × Cin × Cout
• Used for:
• Reducing or increasing the number of channels.
• Adding non-linearity (through the activation that follows it), e.g., after a depthwise convolution; see the sketch below.
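In Keras a pointwise convolution is just Conv2D with kernel_size=1; a minimal sketch, assuming a 32x32x16 input mapped to 32 channels:

import tensorflow as tf

pw = tf.keras.layers.Conv2D(filters=32, kernel_size=1, activation='relu')
y = pw(tf.random.normal((1, 32, 32, 16)))
print(y.shape)            # (1, 32, 32, 32): channels mixed, spatial size unchanged
print(pw.count_params())  # 1*1*16*32 weights + 32 biases = 544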
Grouped Convolution
• Divides the input channels into G groups and applies a standard convolution independently within each group (depthwise convolution is the special case G = Cin)
• Input: H × W × Cin
• Kernel: K × K × (Cin/G)
• Number of filters per group: Cout/G
• Output: H′ × W′ × Cout
• Number of parameters: K × K × (Cin/G) × (Cout/G) × G = K × K × Cin × Cout/G
• FLOPs: H′ × W′ × K × K × Cin × Cout/G
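In Keras this corresponds to the groups argument of Conv2D (available since TF 2.3; note that some CPU builds do not implement grouped convolution). A sketch with Cin = 16, Cout = 32, G = 4:

import tensorflow as tf

g = tf.keras.layers.Conv2D(filters=32, kernel_size=3, groups=4, padding='same')
y = g(tf.random.normal((1, 32, 32, 16)))  # 16 input channels split into 4 groups of 4
print(g.count_params())  # 3*3*(16/4)*32 weights + 32 biases = 1184 (vs 4640 for G = 1)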
Depthwise Separable Convolution
• Combines depthwise and pointwise convolutions.
• First, a depthwise convolution is applied.
• Then, a pointwise convolution is used to combine the output channels.
• Significantly reduces computational cost compared to standard convolution.
• Used in MobileNet and Xception.
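Relative to a standard convolution, the cost drops by a factor of roughly 1/Cout + 1/K². Keras packages the two steps as SeparableConv2D; a sketch reusing the 16-to-32-channel example:

import tensorflow as tf

sep = tf.keras.layers.SeparableConv2D(filters=32, kernel_size=3, padding='same')
y = sep(tf.random.normal((1, 32, 32, 16)))
print(sep.count_params())  # depthwise 3*3*16 + pointwise 16*32 + 32 biases = 688
# A standard 3x3 convolution with the same shapes would need 3*3*16*32 + 32 = 4640.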
Convolutional Layers: Animated Explanation
Groups, Depthwise, and Depthwise-Separable Convolution
Backbone CNN Models: Review
Introduction to LeNet-5
• Historical Significance: LeNet-5, developed by Yann LeCun et al. in the 1990s, is one of the
earliest and most influential Convolutional Neural Network (CNN) architectures.
• Purpose: Designed for handwritten and machine-printed character recognition (e.g., MNIST
dataset).
• Key Innovations: Introduced fundamental CNN concepts:
• Convolutional layers with learnable weights.
• Local receptive fields.
• Spatial subsampling (pooling).
• Shared weights (parameter sharing).
LeNet-5 Architecture
• LeNet-5 consists of seven layers (excluding the input):
• Input Layer: 32x32 grayscale image.
• Convolutional Layer C1: 6 5x5 filters, stride 1, no padding. Output: 28x28x6.
• Subsampling Layer S2 (Average Pooling): 2x2 pooling, stride 2. Output: 14x14x6.
• Convolutional Layer C3: 16 5x5 filters. Output: 10x10x16. Note: in the original LeNet-5 paper, each C3 feature map connects to only a subset of the S2 feature maps (partial connectivity).
• Subsampling Layer S4 (Average Pooling): 2x2 pooling, stride 2. Output: 5x5x16.
• Fully Connected Layer F5: 120 neurons.
• Fully Connected Layer F6: 84 neurons.
• Output Layer: 10 neurons (one for each digit 0-9), using RBF (Radial Basis Function) units in the original paper; modern implementations use Softmax.
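A Keras sketch of this architecture (our re-implementation, not the original code), using tanh activations, a fully connected C3, and Softmax in place of the RBF output:

from tensorflow import keras

lenet5 = keras.Sequential([
    keras.layers.Input((32, 32, 1)),
    keras.layers.Conv2D(6, 5, activation='tanh'),   # C1: 28x28x6
    keras.layers.AveragePooling2D(2),               # S2: 14x14x6
    keras.layers.Conv2D(16, 5, activation='tanh'),  # C3: 10x10x16
    keras.layers.AveragePooling2D(2),               # S4: 5x5x16
    keras.layers.Flatten(),
    keras.layers.Dense(120, activation='tanh'),     # F5
    keras.layers.Dense(84, activation='tanh'),      # F6
    keras.layers.Dense(10, activation='softmax'),   # Output: digits 0-9
])
lenet5.summary()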
Key Concepts and Impact
Key concepts:
• Convolutional Layers: Local receptive fields, feature extraction.
• Subsampling (Pooling): Reducing spatial resolution, increasing robustness to small shifts and distortions.
• Parameter Sharing: Reducing the number of parameters and improving generalization.
• Hierarchical Feature Learning: Lower layers detect simple features (edges, lines), higher layers detect more
complex features (combinations of edges, shapes).
Impacts:
• LeNet-5 laid the foundation for modern CNN architectures.
• Its key concepts are still used in many state-of-the-art models.
• It demonstrated the power of CNNs for image recognition and other tasks involving structured data.
Introduction to AlexNet
• Revolutionary Impact: AlexNet, developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey
Hinton, won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012 by a
significant margin, marking a turning point in DL for computer vision.
• Key Contributions:
• Deeper architecture than previous CNNs.
• Use of ReLU activation functions.
• Training on GPUs for faster training.
• Local response normalization (LRN).
• Overlapping pooling and data augmentation.
AlexNet Architecture
AlexNet consists of eight layers with weights (five convolutional and three fully connected), interleaved with pooling layers:
• Input Layer: 227x227x3 RGB image.
• Convolutional Layer 1: 96 11x11 filters, stride 4, no padding. Output: 55x55x96.
• Max Pooling Layer 1: 3x3 pooling, stride 2. Output: 27x27x96.
• Convolutional Layer 2: 256 5x5 filters, stride 1, padding 2. Output: 27x27x256.
• Max Pooling Layer 2: 3x3 pooling, stride 2. Output: 13x13x256.
• Convolutional Layer 3: 384 3x3 filters, stride 1, padding 1. Output: 13x13x384.
• Convolutional Layer 4: 384 3x3 filters, stride 1, padding 1. Output: 13x13x384.
• Convolutional Layer 5: 256 3x3 filters, stride 1, padding 1. Output: 13x13x256.
• Max Pooling Layer 3: 3x3 pooling, stride 2. Output: 6x6x256.
• Fully Connected Layer 1: 4096 neurons.
• Fully Connected Layer 2: 4096 neurons.
• Output Layer: 1000 neurons (for 1000 ImageNet classes) with Softmax activation.
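A single-tower Keras sketch of these layers (the original split computation across two GPUs; LRN is omitted here, and dropout is applied to the fully connected layers as in the paper):

from tensorflow import keras

alexnet = keras.Sequential([
    keras.layers.Input((227, 227, 3)),
    keras.layers.Conv2D(96, 11, strides=4, activation='relu'),       # 55x55x96
    keras.layers.MaxPooling2D(3, strides=2),                         # 27x27x96
    keras.layers.Conv2D(256, 5, padding='same', activation='relu'),  # 27x27x256
    keras.layers.MaxPooling2D(3, strides=2),                         # 13x13x256
    keras.layers.Conv2D(384, 3, padding='same', activation='relu'),  # 13x13x384
    keras.layers.Conv2D(384, 3, padding='same', activation='relu'),  # 13x13x384
    keras.layers.Conv2D(256, 3, padding='same', activation='relu'),  # 13x13x256
    keras.layers.MaxPooling2D(3, strides=2),                         # 6x6x256
    keras.layers.Flatten(),
    keras.layers.Dense(4096, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(4096, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1000, activation='softmax'),                  # ImageNet classes
])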
Key Innovations and Impact
Key Innovations:
• ReLU Activations: Accelerated training by mitigating vanishing gradients.
• GPU Training: Enabled training of larger models on larger datasets.
• Local Response Normalization (LRN): Local channel normalization (minor impact).
• Overlapping Pooling: Reduced overfitting.
• Data Augmentation: Improved generalization by increasing training data diversity.
Impact:
• Deep Learning Resurgence in CV: Sparked renewed interest and rapid progress in deep learning for
computer vision.
• Foundation for Modern CNNs: Influenced many subsequent CNN architectures.
• Influence on Other Fields: Impacted other areas of deep learning like NLP and speech recognition.
Introduction to VGG-16
• Visual Geometry Group (VGG): Developed by the VGG at the University of Oxford.
• Key Insight: Demonstrated the importance of network depth in achieving better performance in
image classification.
• Uniform Architecture: Used very small (3x3) convolutional filters throughout the entire network,
leading to a much deeper architecture than AlexNet.
• ILSVRC 2014: Placed second in classification (behind GoogLeNet) and first in localization at the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014.
VGG-16 Architecture
• Key Characteristics:
• Only 3x3 convolutional filters with stride 1 and padding 1 are used.
• 2x2 max pooling with stride 2 is used for downsampling.
• Multiple convolutional layers are stacked before each pooling layer.
• Layers (simplified): VGG-16 refers to 16 layers with weights (convolutional or fully connected):
• Input: 224x224x3 RGB image.
• Conv1 (2 layers): 64 filters. Output: 224x224x64
• Max Pool 1: Output: 112x112x64
• Conv2 (2 layers): 128 filters. Output: 112x112x128
• Max Pool 2: Output: 56x56x128
• Conv3 (3 layers): 256 filters. Output: 56x56x256
• Max Pool 3: Output: 28x28x256
• Conv4 (3 layers): 512 filters. Output: 28x28x512
• Max Pool 4: Output: 14x14x512
• Conv5 (3 layers): 512 filters. Output: 14x14x512
• Max Pool 5: Output: 7x7x512
• FC1: 4096 neurons
• FC2: 4096 neurons
• Output (FC3): 1000 neurons (ImageNet classes) with Softmax
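VGG-16 ships with Keras, so the layer list above can be inspected directly:

from tensorflow.keras.applications import VGG16

vgg = VGG16(weights=None)  # pass weights='imagenet' for the pretrained model
vgg.summary()              # 13 conv + 3 FC weight layers, roughly 138M parameters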
VGG-16 vs VGG-19
Advantages of Small 3x3 Convolutions
• Deeper Network: Stacking multiple 3x3 convolutions allows for a deeper network, which
can learn more complex features.
• Reduced Number of Parameters: Two stacked 3x3 convolutions have the same receptive field as one 5x5 convolution but with fewer parameters (counted per input-output channel pair):
• One 5x5: 5 × 5 = 25 weights
• Two 3x3: (3 × 3) + (3 × 3) = 18 weights (with C channels throughout: 18C² vs 25C²)
• More Non-linearities: Stacking more layers increases the number of non-linear
activations (ReLU), which makes the network more expressive.
Impact of VGG Networks
• Emphasis on Depth: Solidified the importance of network depth for achieving high
performance.
• Simple and Effective Design: The uniform architecture with small filters made VGG
networks easy to understand and implement.
• Transfer Learning: VGG models pretrained on ImageNet became widely used for transfer
learning in various computer vision tasks.
Introduction to GoogLeNet
• ILSVRC 2014 Winner: GoogLeNet, developed by Google, won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014 classification task, a marked improvement over earlier architectures and narrowly ahead of VGG.
• Key Innovation: Inception Module: Introduced the Inception module, a novel building block
that significantly improved efficiency and performance.
• Depth and Efficiency: Achieved greater depth than previous networks while maintaining
manageable computational cost.
• Reduced Parameters: Significantly fewer parameters than AlexNet, making it more efficient and
less prone to overfitting.
The Inception Module
• Motivation: To capture features at multiple scales simultaneously.
• Structure: Consists of parallel branches with different convolutional filter sizes (1x1, 3x3, 5x5)
and max pooling.
• 1x1 Convolutions: Used 1x1 convolutions for dimensionality reduction before the more expensive
3x3 and 5x5 convolutions, significantly reducing computational cost.
• Concatenation: The outputs of all branches are concatenated along the channel dimension.
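A functional-API sketch of such a module (the helper name and per-branch filter counts are our choices, not fixed by the paper):

from tensorflow.keras import layers

def inception_module(x, f1, f3r, f3, f5r, f5, fp):
    b1 = layers.Conv2D(f1, 1, padding='same', activation='relu')(x)
    b3 = layers.Conv2D(f3r, 1, padding='same', activation='relu')(x)  # 1x1 reduction
    b3 = layers.Conv2D(f3, 3, padding='same', activation='relu')(b3)
    b5 = layers.Conv2D(f5r, 1, padding='same', activation='relu')(x)  # 1x1 reduction
    b5 = layers.Conv2D(f5, 5, padding='same', activation='relu')(b5)
    bp = layers.MaxPooling2D(3, strides=1, padding='same')(x)
    bp = layers.Conv2D(fp, 1, padding='same', activation='relu')(bp)  # pool projection
    return layers.Concatenate()([b1, b3, b5, bp])  # concatenate along channels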
GoogLeNet Architecture
• Stacking Inception Modules: GoogLeNet consists of multiple Inception modules stacked on top
of each other.
• Auxiliary Classifiers: Included auxiliary classifiers at intermediate layers to improve gradient flow
during training and prevent vanishing gradients.
• No Fully Connected Layers at the End: Used Global Average Pooling (GAP) at the end
instead of fully connected layers, further reducing the number of parameters.
• Simplified Structure (Conceptual): Input - Initial Convolutional Layers - Stacked Inception Modules - Global Average Pooling - Softmax Output
Advantages of GoogLeNet
• Increased Depth and Width: The Inception module allows for increasing both the
depth and width of the network without a significant increase in computational cost.
• Computational Efficiency: Using 1x1 convolutions for dimensionality reduction
significantly reduces the number of parameters and FLOPs.
• Improved Performance: Achieved state-of-the-art performance on ImageNet with
significantly fewer parameters than previous models.
• Reduced Overfitting: The reduced number of parameters and the use of auxiliary
classifiers helped to reduce overfitting.
Impact of GoogLeNet
• Shift Towards Efficient Architectures: Influenced the development of more efficient
CNN architectures.
• Inception Module as a Building Block: The Inception module became a popular
building block in many subsequent CNNs.
• Focus on Computational Cost: Highlighted the importance of considering
computational cost in deep learning model design.
Introduction to Inception-v3
• Evolution of Inception: Inception-v3 is the third iteration of the Inception architecture,
building upon the ideas introduced in GoogLeNet (Inception-v1).
• Focus on Efficiency and Performance: Aimed to further improve both computational
efficiency and classification performance.
• Key Improvements: Introduced several architectural refinements:
• Factorization of larger convolutions into smaller ones.
• Asymmetric convolutions.
• Auxiliary classifiers with improved loss.
• Batch Normalization in auxiliary classifiers.
Factorization of Convolutions
• Factorizing 5x5 Convolutions: A 5x5 convolution can be factorized into two consecutive 3x3
convolutions, reducing the number of parameters and computations:
• One 5x5: 5 × 5 = 25 parameters
• Two 3x3: (3 × 3) + (3 × 3) = 18 parameters
This increases depth, adding more non-linearities (ReLU activations) and thus increasing the
network’s expressiveness.
• Factorizing n × n Convolutions: More generally, any n × n convolution can be factorized into a
sequence of 1 × n and n × 1 convolutions. For example, a 3x3 convolution can be factorized into a
1x3 followed by a 3x1.
Asymmetric Convolutions
• Further Factorization: Inception-v3 further factorizes convolutions by using asymmetric
convolutions, such as 1xn followed by nx1.
• Example: Instead of a 3x3 convolution, Inception-v3 uses a 1x3 convolution followed by a 3x1
convolution.
• Benefits: This further reduces the number of parameters and computations compared to using
two 3x3 convolutions.
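A sketch of this substitution in Keras, assuming 64 channels throughout; the 1x3 + 3x1 pair uses 6C² weights versus 9C² for a single 3x3 (about a third fewer):

from tensorflow import keras
from tensorflow.keras import layers

inp = keras.Input((35, 35, 64))
x = layers.Conv2D(64, (1, 3), padding='same', activation='relu')(inp)  # 1xn
x = layers.Conv2D(64, (3, 1), padding='same', activation='relu')(x)    # nx1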
Improved Auxiliary Classifiers
• Purpose of Auxiliary Classifiers: To improve gradient flow during training, especially in very
deep networks, and prevent vanishing gradients.
• Improvements in v3: In Inception-v3, the auxiliary classifiers were improved by:
• Using batch normalization in the auxiliary classifiers.
• Using a different loss function (softmax cross-entropy) for the auxiliary classifiers.
• Contribution to Final Loss: The loss from the auxiliary classifiers is added to the main loss with
a smaller weight (e.g., 0.3).
Overall Impact of Inception-v3
• State-of-the-Art Performance: Achieved even better performance on ImageNet
compared to its predecessors.
• Emphasis on Efficient Design: Further emphasized the importance of efficient network
design.
• Influence on Subsequent Architectures: Influenced the design of many subsequent
CNN architectures by demonstrating the effectiveness of factorization and asymmetric
convolutions.
Introduction to ResNet
• Challenge of Deep Networks: Training very deep neural networks was a major challenge due to
the vanishing gradient problem.
• Key Innovation: Residual Connections (Skip Connections): ResNet, introduced by He et al.,
addressed this problem with the concept of residual connections (also known as skip connections
or shortcuts).
• ILSVRC 2015 Winner: Achieved state-of-the-art results on ImageNet in 2015, surpassing
human-level performance on the classification task.
The Vanishing Gradient Problem
• Gradient Propagation: During backpropagation, gradients are multiplied as they are passed
through multiple layers.
• Vanishing Gradients: In very deep networks, these repeated multiplications can cause the
gradients to become extremely small, effectively preventing the earlier layers from learning.
• Impact: This makes it difficult to train very deep networks effectively.
Residual Connections
• Concept: Instead of directly learning a mapping H(x), ResNet learns a residual mapping
F (x) = H(x) − x, where x is the input to the layer.
• Residual Block: The output of a residual block is then H(x) = F (x) + x. The addition is
performed using element-wise addition.
• Identity Mapping: If the identity mapping is optimal, the network can easily learn it by setting
F (x) = 0.
• Gradient Flow: Residual connections provide a direct path for gradients to flow through,
mitigating the vanishing gradient problem.
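A minimal sketch of a basic residual block (identity shortcut; it assumes the input already has `filters` channels, and BatchNorm, which real ResNets insert after each convolution, is omitted):

from tensorflow.keras import layers

def residual_block(x, filters):
    shortcut = x                                                         # identity path
    y = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    y = layers.Conv2D(filters, 3, padding='same')(y)                     # F(x)
    y = layers.Add()([y, shortcut])                                      # H(x) = F(x) + x
    return layers.Activation('relu')(y)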
ResNet Architectures
• Different Depths: ResNet comes in various depths (e.g., ResNet-18, ResNet-34, ResNet-50,
ResNet-101, ResNet-152), with the number indicating the number of layers.
• Bottleneck Layers: Deeper ResNet architectures (e.g., ResNet-50 and above) use bottleneck
layers to reduce computational cost. A bottleneck layer consists of a 1x1 convolution, a 3x3
convolution, and another 1x1 convolution.
• Overall Structure (General):
1. Input Convolution and Pooling
2. Several Blocks of Residual Layers (repeated)
3. Global Average Pooling
4. Fully Connected Layer (for classification)
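A sketch of the bottleneck block used by the deeper variants (assumes the input already has 4·c channels; BatchNorm again omitted for brevity):

from tensorflow.keras import layers

def bottleneck_block(x, c):
    y = layers.Conv2D(c, 1, activation='relu')(x)                  # 1x1: reduce channels
    y = layers.Conv2D(c, 3, padding='same', activation='relu')(y)  # 3x3 at reduced width
    y = layers.Conv2D(4 * c, 1)(y)                                 # 1x1: expand channels
    return layers.Activation('relu')(layers.Add()([y, x]))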
Benefits and Impact of ResNet
• Training Very Deep Networks: Enabled the training of significantly deeper networks
than previously possible.
• Improved Performance: Achieved state-of-the-art results on various computer vision
tasks.
• Foundation for Future Architectures: The concept of residual connections has become
a fundamental building block in many subsequent CNN architectures.
MobileNet: Efficient Mobile-First CNNs
• Key Idea: Focuses on extreme computational efficiency for mobile and embedded devices.
• Key Components:
• Depthwise Separable Convolutions: Factorizes standard convolutions into depthwise and pointwise convolutions to significantly reduce computation.
• Width Multiplier: A hyperparameter to control the number of channels, further reducing computation.
• Resolution Multiplier: A hyperparameter to control the input image resolution, also impacting computation.
• Goal: Achieve a good balance between accuracy and latency/model size.
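Both hyperparameters are exposed by the Keras MobileNet constructor: alpha is the width multiplier, and the input shape plays the role of the resolution multiplier. A sketch with random weights:

from tensorflow.keras.applications import MobileNet

full = MobileNet(weights=None, alpha=1.0)                             # full width at 224x224
thin = MobileNet(weights=None, alpha=0.5, input_shape=(160, 160, 3))  # half width, lower resolution
print(full.count_params(), thin.count_params())                       # the thin model is several times smaller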
DenseNet: Dense Connections for Feature Reuse
• Idea: Maximizes information flow between layers by connecting each layer to all preceding layers.
• Dense Blocks: Each layer receives feature maps from all preceding layers as input and passes its
own feature maps to all subsequent layers.
• Benefits:
• Strong feature reuse, leading to more compact models.
• Mitigates the vanishing gradient problem.
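A sketch of one dense block (the layer count and growth rate are example values; real DenseNets also use BatchNorm and 1x1 bottleneck convolutions):

from tensorflow.keras import layers

def dense_block(x, num_layers=4, growth_rate=12):
    for _ in range(num_layers):
        y = layers.Conv2D(growth_rate, 3, padding='same', activation='relu')(x)
        x = layers.Concatenate()([x, y])  # each new layer sees all earlier feature maps
    return x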
SENet (Squeeze-and-Excitation Networks): Channel Attention
• Key Idea: Introduces channel-wise attention mechanisms to dynamically recalibrate channel-wise
feature responses.
• Key Component: Squeeze-and-Excitation (SE) Block:
• Squeeze: Global average pooling to obtain channel-wise statistics.
• Excitation: Two fully connected layers with a sigmoid activation to learn channel-wise weights.
• Scale: Element-wise multiplication of the channel weights with the original feature maps.
• Benefit: Improves feature discrimination by emphasizing important channels and suppressing less
important ones.
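The three steps map directly onto a few Keras layers; a sketch using the common reduction ratio of 16:

from tensorflow.keras import layers

def se_block(x, reduction=16):
    c = x.shape[-1]
    s = layers.GlobalAveragePooling2D()(x)                  # squeeze: channel statistics
    s = layers.Dense(c // reduction, activation='relu')(s)  # excitation: bottleneck FC
    s = layers.Dense(c, activation='sigmoid')(s)            # per-channel weights in (0, 1)
    s = layers.Reshape((1, 1, c))(s)
    return layers.Multiply()([x, s])                        # scale the original features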
ResNeXt: Aggregated Residual Transformations
• Key Idea: Extends ResNet by replicating multiple parallel paths (transformations) within each
residual block, aggregating their outputs.
• Key Component: Cardinality: The number of parallel paths, acting as a new dimension besides
depth and width.
• Benefit: Improves performance by exploring a richer set of transformations while maintaining
computational efficiency.
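In its grouped-convolution form (shown equivalent to the multi-branch form in the ResNeXt paper), the block is a bottleneck whose 3x3 convolution uses groups = cardinality; a sketch assuming the input width matches the block output:

from tensorflow.keras import layers

def resnext_block(x, c=128, cardinality=32):
    y = layers.Conv2D(c, 1, activation='relu')(x)
    y = layers.Conv2D(c, 3, padding='same', groups=cardinality, activation='relu')(y)
    y = layers.Conv2D(x.shape[-1], 1)(y)                    # project back to input width
    return layers.Activation('relu')(layers.Add()([y, x]))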
Recent Cutting-Edge Models (Brief Overview)
• EfficientNet: Focuses on compound scaling of network width, depth, and resolution using a
principled approach.
• RegNet: Explores network design space using a population-based search to find optimal
architectures.
• Vision Transformers (ViT): Applies the Transformer architecture from NLP to image
classification, treating images as sequences of patches.
• ConvNeXt: A modern take on the classical ConvNet design inspired by the Transformer
architecture, showing the strong potential of carefully designed ConvNets.
Performance: Accuracy vs Complexity
A good neural network achieves high accuracy while remaining fast.
Python Code - Image Classification (Part 01)
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import ResNet50V2  # Example: ResNet50V2
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model
import matplotlib.pyplot as plt
import numpy as np

# Data paths (adjust these for your data)
train_dir = '/content/drive/MyDrive/Colab Notebooks/final_project_dataset/training_set'
validation_dir = '/content/drive/MyDrive/Colab Notebooks/final_project_dataset/test_set'

IMG_SIZE = (224, 224)  # ResNet50V2 input size

# Data augmentation and preprocessing
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True
)
Python Code - Image Classification (Part 02)
validation_datagen = ImageDataGenerator(rescale=1./255)

try:
    # Attempt to create data generators
    train_generator = train_datagen.flow_from_directory(
        train_dir,
        target_size=IMG_SIZE,
        batch_size=32,
        class_mode='categorical'  # or 'binary' if you have two classes
    )
    validation_generator = validation_datagen.flow_from_directory(
        validation_dir,
        target_size=IMG_SIZE,
        batch_size=32,
        class_mode='categorical'  # or 'binary' if you have two classes
    )
except OSError as e:
    print(f"Error creating data generators: {e}")
    raise  # Re-raise to stop execution on data generator errors
Python Code - Image Classification (Part 03)
# Load pre-trained model (ResNet50V2 in this example)
base_model = ResNet50V2(
    weights='imagenet',
    include_top=False,  # Exclude the classification layer
    input_shape=IMG_SIZE + (3,)
)

# Freeze the base model layers
base_model.trainable = False

# Add custom classification head
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)  # Add a dense layer
predictions = Dense(train_generator.num_classes, activation='softmax')(x)  # Output layer
model = Model(inputs=base_model.input, outputs=predictions)
Python Code - Image Classification (Part 04)
# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])  # Adjust loss (e.g., binary_crossentropy for two classes)

# Train the model
epochs = 10  # Adjust as needed
try:
    history = model.fit(
        train_generator,
        steps_per_epoch=train_generator.samples // train_generator.batch_size,
        epochs=epochs,
        validation_data=validation_generator,
        validation_steps=validation_generator.samples // validation_generator.batch_size
    )
except Exception as e:  # Catch any training errors
    print(f"Error during training: {e}")
    raise  # Re-raise: `history` would be undefined below if training failed

# Access training history (outside the try block)
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
Python Code - Image Classification (Part 05)
epochs_range = range(epochs)

plt.figure(figsize=(15, 5))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

# Save the model
model.save('image_classifier_model.h5')