Deep Learning
MANAGEMENT DE NABEUL
Prepared by:
Safa BEN OTHMEN
Supervised by:
2024 - 2025
ABSTRACT
This work is part of our mini-project at ITBS for the 2024-2025 academic year. The objective of this project is to develop a brain tumor detection application. For its implementation, we have chosen to use the Streamlit framework, the Brain Tumor MRI dataset, and a CNN model together with a VGG16 transfer learning model. This document provides a detailed description of the different stages of the project implementation.
Contents

General Introduction
1 Project Context
1.1 Introduction
1.2 Problem Statement
1.3 Proposed Solution
1.4 Working Methodology
1.4.1 Proposed Methodology for Brain Tumor Classification
1.4.1.1 Dataset Selection
1.4.1.2 Data Preprocessing
1.4.1.3 Data Augmentation
1.4.1.4 Transfer Learning and Model Evaluation
1.4.1.5 Model Phase
1.4.1.6 Classification and Outcome
1.5 Conclusion
3 Development
3.1 Introduction
3.2 Data Acquisition
3.2.1 Data Collection
3.2.2 Data Splitting
3.3 Modeling
3.3.1 Basic CNN Model
3.3.1.1 Architecture Description
3.3.2 Data Augmentation to Improve Generalization
3.3.3 Use of the Pretrained VGG16 Model
3.4 Conclusion
Conclusion
List of Figures
List of Tables
General Introduction
The accurate diagnosis of brain tumors from MRI scans is a critical yet time-consuming challenge in clinical practice, requiring advanced expertise to interpret complex imaging patterns. This research addresses the growing need for reliable computer-aided diagnostic systems by developing deep learning models for automated tumor classification.
Using a dataset of 7,000 annotated MRI scans covering four tumor categories (glioma, meningioma, pituitary, and no tumor), we implement and compare multiple neural network architectures. The study begins with a custom-designed convolutional neural network (CNN) to establish baseline performance, then enhances model robustness through systematic data augmentation techniques. We further optimize detection accuracy by adapting the pre-trained VGG16 model via transfer learning, demonstrating how existing architectures can be specialized for medical imaging tasks.
The remainder of this report is organized into four main chapters: Project Context, State of the Art, Development, and Evaluation and Deployment.
Chapter 1
Project Context
1.1 Introduction
The objective of this chapter is to present, in the first section, the problem that this project aims to solve, then to discuss the objectives we aim to achieve and to propose our solution. Finally, we outline the methodology chosen for the development and deployment of our solution.
1.4 Working Methodology
A working methodology has been adopted to ensure an efficient and structured development process.
1.4.1 Proposed Methodology for Brain Tumor Classification
The proposed methodology follows a sequence of essential steps, starting from data acquisition and ending with the final classification outcome. The different phases are detailed below.
1.4.1.1 Dataset Selection
The dataset used consists of brain MRI images, including four categories: glioma tumor, pituitary tumor, meningioma tumor, and no tumor.
1.4.1.3 Data Augmentation
To increase the diversity of the training set and prevent overfitting, data augmentation techniques such as rotation, flipping, zooming, and shifting were applied. This step helps the model generalize better to unseen data.
1.4.1.4 Transfer Learning and Model Evaluation
Given the limited size of the dataset, transfer learning was employed using pre-trained convolutional neural networks (CNNs). The selected model was fine-tuned and evaluated on the prepared dataset to leverage the pre-learned features and adapt them to the specific task.
1.4.1.5 Model Phase
During this phase, the fine-tuned model is trained on the augmented dataset. Hyperparameters are optimized to improve the learning process, and model evaluation is conducted iteratively to monitor progress and adjust parameters as needed.
1.4.1.6 Classification and Outcome
Finally, the trained model is used to classify new MRI images into one of the four categories: glioma, pituitary tumor, meningioma tumor, or no tumor. The model's performance is assessed using appropriate metrics such as accuracy, precision, recall, and F1-score.
1.5 Conclusion
In this chapter, we outlined the problem and the objectives to be achieved for our project, and proposed our solution. Finally, we presented the methodology we have adopted.
Chapter 2
State of the Art
2.1 Introduction
Once the methodology has been chosen, this chapter presents the different existing techniques used for brain tumor detection from MRI images.
Recurrent neural networks (RNNs) have connections that form directed cycles, which allows the output of one step to be fed back as an input to the current step. An RNN can therefore remember previous inputs using its internal memory. In practice, RNNs are used for image captioning, natural language processing, and machine translation.
LSTMs are a variant of RNNs that can learn and memorize dependencies over long periods of time, retaining information over the long term. They are particularly useful for predicting time series data because they remember previous inputs. Beyond this use case, LSTMs are also used for composing music and for speech recognition.
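As a brief hedged illustration (RNNs/LSTMs are not used in our application, and the sequence length and layer sizes below are arbitrary assumptions), an LSTM layer can be defined in Keras as follows:

from tensorflow.keras import layers, models

# Illustrative only: an LSTM reading sequences of 10 time steps with 8 features
# each and predicting a single continuous value. Shapes are arbitrary assumptions.
sequence_model = models.Sequential([
    layers.Input(shape=(10, 8)),
    layers.LSTM(32),        # the internal memory retains long-range dependencies
    layers.Dense(1),
])
sequence_model.compile(optimizer='adam', loss='mse')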
The following are definitions of the different layers found in a typical CNN architecture:
Convolutional Layers
Convolutional layers operate by sliding a set of filters or kernels across the input data.
Each filter is designed to detect specific features or patterns, such as edges, corners,
or more complex shapes in deeper layers. As these filters move across the image,
they generate a feature map highlighting where particular features were detected.
In simple terms, convolutional layers are responsible for extracting features from
the input images.
Pooling Layers
Pooling layers follow convolutional layers and are used to reduce the spatial dimensions (width and height) of the input. An image can be thought of as a grid of pixels; reducing spatial dimensions helps decrease the number of parameters and computation in the network.
Pooling layers help to prevent overfitting and enable the model to train faster and
more efficiently.
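As a minimal illustration of these two layer types (the input size and filter count are assumptions chosen only for the example), the following Keras snippet shows how a convolutional layer produces feature maps and how a pooling layer halves their spatial dimensions:

from tensorflow.keras import layers, models

demo = models.Sequential([
    layers.Input(shape=(64, 64, 1)),              # assumed 64x64 grayscale input
    layers.Conv2D(8, (3, 3), activation='relu'),  # 8 filters -> 8 feature maps of 62x62
    layers.MaxPooling2D((2, 2)),                  # reduces each feature map to 31x31
])
demo.summary()   # prints the output shape after each layer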
Output Layer
The output layer in a CNN is crucial, as it produces the final output of the network,
either for classification or regression tasks.
• Transformation of Features to Final Output: The earlier layers extract and abstract features from the input; the output layer transforms these features into a final, interpretable result.
• Task-Specific Formulation:
– For classification tasks, the output layer typically uses a softmax activation function to convert the features into a probability distribution over the predefined classes, ensuring that the probabilities sum to 1 (a small numerical illustration is given below).
– For regression tasks, the output layer usually consists of one or more neurons with a linear (or no) activation function, producing continuous output values.
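As a small numerical sketch of the softmax behaviour described above (the raw scores are made up for the example):

import numpy as np

def softmax(z):
    """Turn raw scores into a probability distribution that sums to 1."""
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1, -1.0])   # hypothetical raw outputs for 4 classes
probabilities = softmax(scores)
print(probabilities, probabilities.sum())  # the probabilities sum to 1.0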
2.3.4.1 Definition
2.3.4.3 VGG16
2.3.4.4 ResNet
ResNet introduces residual blocks that make it possible to build very deep networks while avoiding the vanishing gradient problem. A ResNet can have hundreds, or even thousands, of layers.
2.3.4.5 MobileNet
In our study, we observed that the VGG-16 architecture proved to be the most efficient for the medical image classification task, achieving high performance compared to the VGG-19 and ResNet50 architectures. This was true across different image contrast states, whether the images were in their normal state or enhanced by the CLAHE technique.
2.4 Conclusion
In this chapter, we provided an overview of deep learning, highlighting the learning techniques used in our application.
Chapter 3
Development
3.1 Introduction
This chapter details the data processing and modeling approach for brain tumor
detection from MRI scans. We cover dataset preparation, CNN architecture design,
and performance enhancement techniques including data augmentation and transfer
learning with VGG16.
3.2.2 Data Splitting
The dataset was split into three subsets, as illustrated in the sketch after this list:
• Training Set: Represents 70% of the data and is used to train the model.
• Test Set: Consists of the remaining 20% of the data and is used to evaluate the final performance of the model.
• Validation Set: Represents 10% of the data; this subset is further split from the training data to tune the model's hyperparameters and prevent overfitting.
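As a hedged sketch of this 70/20/10 split (X and y are hypothetical arrays holding the images and their labels; the exact splitting code used in the project may differ):

from sklearn.model_selection import train_test_split

# Hold out 20% of the data as the test set.
X_train_val, X_test, y_train_val, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42)

# Take 12.5% of the remaining 80% (i.e. 10% of the total) as the validation set,
# which leaves 70% of the data for training.
X_train, X_val, y_train, y_val = train_test_split(
    X_train_val, y_train_val, test_size=0.125, stratify=y_train_val, random_state=42)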
3.3 Modeling
To accomplish the task of brain tumor detection from MRI images, several deep learning models were developed and evaluated. Initially, a basic CNN model was implemented to establish a baseline performance. Subsequently, data augmentation techniques were applied to enhance the model's robustness. Finally, a transfer learning approach was explored using the pre-trained VGG16 model.
3.3.1 Basic CNN Model
Once our data was prepared, we defined the architecture of our CNN model.
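As a hedged sketch of such a baseline (the filter counts, dense size, and input resolution below are assumptions, not necessarily the exact configuration used), a basic CNN for the four MRI categories can be defined in Keras as follows:

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),             # assumed input resolution
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),                           # helps limit overfitting
    layers.Dense(4, activation='softmax'),         # glioma, meningioma, pituitary, no tumor
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])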
3.3.2 Data Augmentation to Improve Generalization
The applied transformations (rotation, flipping, zooming, and shifting) and their expected impact on generalization are illustrated in the sketch below.
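A hedged sketch of how these transformations could be applied with Keras is given below (the parameter values and the directory layout are assumptions):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rotation, flipping, zooming and shifting, as described in the methodology.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=15,
    horizontal_flip=True,
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
)

# Hypothetical directory with one sub-folder per class.
train_generator = train_datagen.flow_from_directory(
    'data/train', target_size=(224, 224), batch_size=32, class_mode='categorical')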
3.3.3 Use of the Pretrained VGG16 Model
Transfer learning refers to using a model that has been pretrained on a large dataset to solve a similar problem with a new dataset. This approach helps reduce training time and enhances performance, especially when working with a limited amount of data.
In the context of brain tumor detection from medical images, the idea is to leverage a model already trained to classify general objects (like VGG16, which was trained on ImageNet) and adapt it to detect brain tumors.
Why VGG16?
VGG16 is a deep convolutional neural network (CNN) model developed by the
Visual Geometry Group at the University of Oxford. It was trained on the ImageNet
dataset, which contains over 1 million images across 1000 categories. This model has
demonstrated an excellent ability to extract high-level features from images, making
it highly suitable for transfer learning.
In the case of brain tumor detection, we will adapt VGG16 so that it can identify tumors from specific medical images.
Steps for Adapting the VGG16 Model for Brain Tumor Detection:
1. Data Preparation: Before adapting the VGG16 model, it is essential to properly prepare the brain MRI images, notably by resizing them to the input size expected by VGG16 and applying the appropriate preprocessing (a hedged sketch of the overall adaptation follows).
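The following sketch illustrates the overall adaptation (freezing the pretrained convolutional base and adding a new classification head); the head size and training settings are assumptions, not the exact configuration used:

from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Load VGG16 trained on ImageNet, without its original 1000-class classifier.
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base_model.trainable = False        # freeze the pre-learned convolutional features

# Add a new classification head for the four MRI categories.
model = models.Sequential([
    base_model,
    layers.Flatten(),
    layers.Dense(256, activation='relu'),   # assumed head size
    layers.Dropout(0.5),
    layers.Dense(4, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])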
3.4 Conclusion
We presented a complete pipeline for brain tumor classification, from data splitting to advanced modeling. The next chapter evaluates and compares the performance of each model.
Chapter 4
Evaluation and Deployment
4.1 Introduction
Model evaluation in deep learning is essential for selecting optimal architectures and ensuring clinical applicability. This chapter presents a comparative analysis of our CNN models and their deployment.
The training curves show a progressive decrease of the cross-entropy loss during training and validation, indicating stable learning.
4.2.1.3 Accuracy
Accuracy is a metric that measures how often a machine learning model correctly predicts the outcome. Here are the accuracy results for the basic CNN model.
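Formally, accuracy is the proportion of correct predictions among all predictions; using the entries of a confusion matrix it can be written as:

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]

where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.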
Before deploying the app, we must first save our model (architecture and weights) in a format that can be loaded later to make predictions on new data.
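A hedged sketch of this step with Keras, assuming model is the trained model from the previous sketches (the file name is hypothetical):

# Save the trained model (architecture and weights) to disk.
model.save('brain_tumor_model.h5')       # hypothetical file name

# Later, inside the application, the model can be reloaded as follows:
from tensorflow.keras.models import load_model
model = load_model('brain_tumor_model.h5')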
After preparing the model, we can move on to building the Streamlit application.
The application will include the following features (a minimal sketch follows the list):
• User input: The user provides a brain MRI image through the Streamlit interface.
• Prediction display: The model predicts and displays the corresponding category.
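A minimal hedged sketch of such an application is shown below; the class-name ordering, file name, and preprocessing are assumptions and must match whatever was used at training time:

import numpy as np
import streamlit as st
from PIL import Image
from tensorflow.keras.models import load_model

CLASS_NAMES = ['glioma', 'meningioma', 'no tumor', 'pituitary']   # assumed order

st.title('Brain Tumor Detection')
model = load_model('brain_tumor_model.h5')                         # hypothetical file name

uploaded = st.file_uploader('Upload a brain MRI image', type=['jpg', 'jpeg', 'png'])
if uploaded is not None:
    image = Image.open(uploaded).convert('RGB').resize((224, 224))
    st.image(image, caption='Uploaded MRI scan')
    batch = np.expand_dims(np.array(image) / 255.0, axis=0)        # assumed preprocessing
    probs = model.predict(batch)[0]
    st.write(f'Predicted category: {CLASS_NAMES[int(np.argmax(probs))]}')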
We then describe the steps taken to deploy the application on the chosen platform and the tools used in the process.
4.4 Conclusion
This chapter presented the evaluation and deployment of our brain tumor detection system. After comparing multiple deep learning architectures, our custom CNN achieved superior performance (95% accuracy) over pre-trained models like VGG16. The model was successfully deployed as a Streamlit web application.
Conclusion
Diagnosing a disease in a healthy patient does not have the same consequences as failing to detect it in a sick individual. In the first case, unnecessary treatments might be administered or additional tests requested, leading to costs and inconvenience. In the second case, the lack of diagnosis and appropriate treatment could lead to a rapid and irreversible deterioration of the patient's health. This issue highlights the importance of the precision and reliability of prediction models in medicine, especially for serious conditions such as brain tumors. Therefore, to avoid diagnostic errors, studying and applying image classification techniques in medicine is crucial.
In the context of this project, we followed a structured methodology, from understanding the application domain (brain tumor detection from MRI scans) to model deployment. This methodology guided each step of the project, including data collection, preprocessing, data augmentation, modeling, model performance evaluation, and finally deployment. Using a custom CNN model and transfer learning with VGG16, we were able to classify brain MRI images into four categories: glioma, meningioma, pituitary tumor, and no tumor.