Machine learning
Introduction
Mohamed FARAH
Academic year: 2024-2025
Machine Learning
Machine Learning is the field of study that gives computers
the ability to learn without being explicitly programmed
(Arthur Samuel, 1959)
Machine Learning
Machine Learning is:
• a subfield of artificial intelligence that uses mathematical and statistical approaches to enable computers to learn from data
• a way to solve decision problems automatically, without explicit programming
• concerned with the design, optimisation and implementation of methods that learn from past data in order to predict new observations
Machine Learning – new programming paradigm
Traditional Programming: Data + Program → Computer → Output
Machine Learning: Data + Output → Computer → Program (a data-driven program)
Automating automation: getting computers to program themselves
Machine Learning
A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance
at tasks in T, as measured by P, improves with experience E (Tom
Mitchell, 1998)
Example:
• Experience E (data): games played by the program (against itself)
• Performance measure P: winning rate
Learning is the acquisition of the ability to perform the task;
how to learn it is a separate question, and there are many methods
The Experience E / Data
Most algorithms experience an entire dataset
Dataset: A collection of examples, aka data points
An example is a collection of features (data) that have been
quantitatively measured for some object/event that we
want the ML system to process
Data – Example
Anderson’s Iris data (one of the oldest datasets in statistics/ML, 1936)
• Measurements of 150 iris flowers
- 4 attributes: sepal length, sepal width, petal length, petal width, so each example x ∈ ℝ⁴
• 3 species: Setosa, Versicolor, Virginica
https://en.wikipedia.org/wiki/Iris_flower_data_set
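As a quick hands-on illustration, a minimal sketch, assuming scikit-learn is installed (it bundles this dataset):

```python
from sklearn.datasets import load_iris

# Load Anderson's Iris data: 150 examples, 4 features each
iris = load_iris()
X, y = iris.data, iris.target      # X: (150, 4) matrix, y: labels 0/1/2
print(X.shape)                     # (150, 4)
print(iris.feature_names)          # sepal/petal length and width
print(iris.target_names)           # ['setosa' 'versicolor' 'virginica']
```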
Data as vectors, matrices, tensors
Tensors: generalization of matrices
to n dimensions (or rank, order, degree)
• 1D tensor: vector
• 2D tensor: matrix
• 3D, 4D, 5D tensors
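A small numpy illustration of these ranks (numpy calls the rank ndim); the shapes are illustrative:

```python
import numpy as np

vector = np.zeros(4)                  # 1D tensor: e.g. one Iris example
matrix = np.zeros((150, 4))           # 2D tensor: a whole dataset
images = np.zeros((32, 64, 64, 3))    # 4D tensor: a batch of 32 RGB images
print(vector.ndim, matrix.ndim, images.ndim)  # 1 2 4
```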
Data
Datasets decomposition
• Training set: data to train on
• Validation set: data to tune hyperparameters and decide when to stop training
• Test (or generalisation) set: data to evaluate on
These datasets are all disjoint
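One common way to obtain the three disjoint sets, sketched with scikit-learn's train_test_split (the 60/20/20 ratios are illustrative, not prescribed here):

```python
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# First carve out a disjoint test set, then split the rest into train/validation
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))   # 90 / 30 / 30: all disjoint
```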
Dataset Assumptions
The data are assumed to be generated by an underlying probability distribution (the data-generating distribution)
We typically make the i.i.d. assumptions:
• Samples are independent of each other
• Training and test sets are identically distributed
(drawn from the same distribution)
The Task T
ML enables tackling tasks too difficult to solve with fixed
programs written and designed manually
T is usually described in terms of how the machine learning
system should process an example
NB. The process of learning itself is not the task
The Performance Measures, P
P is specific to the task T
Well known measures based on the confusion matrix
Accuracy
Precision
Recall
F-score
etc.
! Applied to data not seen before:
the test set, not the training set
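A minimal sketch of these measures with scikit-learn, using made-up binary labels for a hypothetical test set:

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

# Hypothetical labels on a held-out test set (never the training set)
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1-score :", f1_score(y_true, y_pred))
```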
After the task is learned
Processing of new data is called inference
Computational cost: high during training vs. lower during inference
Related Domains
Statistics: learning theory, data mining, inference
Computing: AI, computer vision, IR
Engineering: signal, robotics, control
Cognitive science, psychology, epistemology, neuroscience
Economics: decision theory, game theory
Applications of Machine Learning
Computer Vision
Image recognition, segmentation, classification, etc.
(Figure: an input image fed to a model, which outputs "cat" or "dog")
Example: recognition of handwritten characters
Computer Vision
Example: face detection
Computer Vision
Example: detection of pedestrians
(Figure: examples of training images)
Natural Language Processing (NLP)
Example: classification of textual documents
Natural Language Processing (NLP)
Example: detection of spam in emails
Hint: count the frequency and co-occurrence of certain keywords, e.g.
congratulations, lottery, win, prize, etc.
Natural Language Processing (NLP)
Example: automatic translation
“How are you?” → Model → “Wie geht’s dir?” (a translating machine)
Natural Language Processing (NLP)
Example: recommendation systems
Natural Language Processing (NLP)
Example: chatbots
“How are you?” → Model → “I am fine, thank you” (a conversational agent / chatbot)
Bio-Informatics
Sequence alignment, analysis of genetic data, etc.
Example: prediction of emergency Caesarean conditions
Signal processing
Speech recognition, speaker identification, speech-to-text, text-to-speech, etc.
audio → Model → “Hello” (speech recognition)
Other areas of application
Robotics: estimation of positions, states, etc.
Financial analysis: portfolio allocation, credit and loan decisions, etc.
Medicine: diagnosis, treatment, design of therapies, etc.
Graphic design: realistic designs and simulations, etc.
Social networks
Content generation
etc.
Learning Types
(based on tasks)
Learning Types
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
Supervised learning
Supervised learning
Given: a dataset that contains n samples (x^(1), y^(1)), …, (x^(n), y^(n))
Task: if a residence has x square feet, predict its price y
e.g., the 15th sample (x^(15), y^(15)): x^(15) = 800, y^(15) = ?
Housing price prediction
Regression vs Classification
regression: if y ∈ ℝ is a continuous variable
e.g., price prediction
classification: if the label y is a discrete variable
e.g., the task of predicting the type of residence:
x = (size, lot size) → y = house or townhouse?
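To make the distinction concrete, a minimal scikit-learn sketch with made-up housing numbers (all values are illustrative): a linear regressor for the continuous price, a logistic classifier for the discrete residence type.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Regression: square feet -> price (continuous y); toy numbers
X = np.array([[600], [800], [1000], [1200]])   # square feet
y = np.array([150, 200, 250, 300])             # price in k$
reg = LinearRegression().fit(X, y)
print(reg.predict([[800]]))                    # predicted price for 800 sq ft

# Classification: (size, lot size) -> house (1) or townhouse (0); toy labels
X2 = np.array([[600, 0], [800, 100], [1500, 3000], [2000, 5000]])
y2 = np.array([0, 0, 1, 1])
clf = LogisticRegression().fit(X2, y2)
print(clf.predict([[900, 200]]))               # predicted class
```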
Supervised Learning – Model Types
2 types of models:
• Discriminative model:
• estimates P(y | x) directly
• we learn the decision boundary
• Generative model:
• estimates P(x | y) and P(y), then deduces P(y | x) (e.g., via Bayes' rule)
• we learn the probability distributions of the data
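A hedged illustration on the Iris data: logistic regression as a discriminative model (learns P(y|x) directly) vs. Gaussian Naive Bayes as a generative one (models P(x|y) and P(y)):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

disc = LogisticRegression(max_iter=1000).fit(X, y)  # learns P(y|x) directly
gen = GaussianNB().fit(X, y)                        # models P(x|y) and P(y)

print(disc.predict_proba(X[:1]))  # P(y|x) from the decision boundary
print(gen.predict_proba(X[:1]))   # P(y|x) deduced via Bayes' rule
```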
Supervised learning in Computer Vision
Image Classification
x = raw pixels of the image, y = the main object
ImageNet Large Scale Visual Recognition Challenge. Russakovsky et al.’2015
Supervised learning in Computer Vision
Object localization and detection
x = raw pixels of the image, y = the bounding boxes
ImageNet Large Scale Visual Recognition Challenge. Russakovsky et al.’2015
Supervised learning in Computer Vision
Recognition of handwritten characters (OCR)
x: pixel intensity values of the image
y: identity of the character (the class)
Supervised learning in NLP
Machine translation
Unsupervised learning
Also called Knowledge discovery
Unsupervised Learning
Dataset contains no labels: x^(1), …, x^(n)
Target is not explicitly known
Goal (vaguely-posed): to find interesting structures /
patterns in the data
(Figure: the same point cloud with class labels (supervised) vs. without labels (unsupervised))
Clustering
k-means clustering, mixture of Gaussians, etc.
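A minimal k-means sketch with scikit-learn on two made-up blobs of unlabelled points:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy 2D data with no labels: two loose blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])       # cluster assignment for the first 10 points
print(kmeans.cluster_centers_)   # the 2 learned centroids
```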
Density Estimation
learning the probability distribution that generated the data.
• To generate new realistic data
• To distinguish “realistic” data from “false” data (e.g., spam filtering)
• To compress data
• etc.
Density Estimation
Given a sample {x^(i), i = 1..n} from a distribution,
obtain an estimate of the density function at any point.
Parametric:
• Assume a parametric family of densities p(x; θ) (e.g., the Gaussian N(μ, σ²)) and obtain the best estimate θ̂ of θ
Nonparametric:
• Obtain a good estimate of the entire density directly from the sample (e.g., a histogram)
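A small numpy sketch contrasting the two approaches on a sample that is secretly Gaussian (the parameters 2.0 and 1.5 are made up):

```python
import numpy as np

# Sample from an "unknown" distribution (here secretly Gaussian)
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=1000)

# Parametric estimate: assume N(mu, sigma^2), fit by maximum likelihood
mu_hat, sigma_hat = x.mean(), x.std()
print(mu_hat, sigma_hat)   # close to 2.0 and 1.5

# Nonparametric estimate: a histogram of the sample
counts, edges = np.histogram(x, bins=30, density=True)
print(counts[:5])          # estimated density in the first bins
```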
Representation learning
Automatically extracting useful, meaningful features from raw data without labels.
The aim is to transform the data into a more compact and informative representation (embeddings) that facilitates subsequent tasks such as classification or clustering.
Word Embedding
Represent words by vectors (word → encode → vector)
(Figure: capital-country pairs Rome-Italy, Paris-France, Berlin-Germany; a shared vector direction encodes the capital-of relation)
Word2vec [Mikolov et al’13]
GloVe [Pennington et al’14]
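A minimal sketch with the gensim library (an assumption: no tool is prescribed here); the toy three-sentence corpus is far too small to give meaningful vectors, but the API flow is the same at scale:

```python
from gensim.models import Word2Vec

# Tiny toy corpus; real embeddings need large-scale text
sentences = [
    ["paris", "is", "the", "capital", "of", "france"],
    ["rome", "is", "the", "capital", "of", "italy"],
    ["berlin", "is", "the", "capital", "of", "germany"],
]
model = Word2Vec(sentences, vector_size=16, min_count=1, seed=0)
vec = model.wv["paris"]                 # the learned word vector
print(vec.shape)                        # (16,)
print(model.wv.most_similar("paris"))   # nearest words in vector space
```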
Clustering Words with Similar Meaning
(Hierarchically)
[Arora-Ge-Liang-M.-Risteski, TACL’17,18]
Dimensionality reduction
reduce the number of variables or dimensions of the data,
while preserving the essential information.
(Figure: the “swiss roll” dataset, a 2D manifold embedded in 3D)
https://link.springer.com/article/10.1007/s00477-016-1246-2/figures/1
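The simplest linear instance is PCA, sketched below with scikit-learn on made-up data; note that unrolling a nonlinear manifold like the swiss roll would need a nonlinear method (e.g., Isomap), so this is only the linear baseline:

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy high-dimensional data: 100 samples, 10 features
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))

pca = PCA(n_components=2)             # keep the 2 main directions of variance
X2 = pca.fit_transform(X)
print(X2.shape)                       # (100, 2)
print(pca.explained_variance_ratio_)  # information kept per component
```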
Latent Semantic Analysis (LSA)
(Figure: a word-document matrix, factorised to detect topics)
Principal Component Analysis (PCA) used in LSA
https://commons.wikimedia.org/wiki/File:Topic_detection_in_a_document-word_matrix.gif
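A minimal LSA-style sketch with scikit-learn, where TruncatedSVD plays the PCA-like role on a TF-IDF word-document matrix built from a made-up four-document corpus:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell sharply today",
    "investors sold shares on the market",
]
X = TfidfVectorizer().fit_transform(docs)   # word-document matrix
lsa = TruncatedSVD(n_components=2)          # PCA-like factorisation used in LSA
topics = lsa.fit_transform(X)
print(topics.shape)                         # (4 docs, 2 latent topics)
```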
Large Language Models (LLM)
Machine learning models for language, trained on large-scale text datasets
They can be used for many purposes
Language Models are Few-Shot Learners [Brown et al.’20]
https://openai.com/blog/better-language-models/
Reinforcement learning
Reinforcement learning
Learning to make sequential decisions
Chess
• 1997: Deep Blue (IBM) defeated world chess champion Garry Kasparov
in a six-game match.
• 2017: AlphaZero (DeepMind) defeated Stockfish (chess engine)
Go
• 2016: AlphaGo (DeepMind) defeated 18-time world champion Lee
Sedol 4-1 in a five-game match.
• 2017: AlphaGo Master defeated world champion Ke Jie
• 2017: AlphaGo Zero (a more advanced version) surpassed all previous
versions
Reinforcement learning
The algorithm can collect data interactively:
try the strategy and collect feedback → data → improve the strategy, training on the collected feedback → try again
Reinforcement learning
Problem / Data
• A state describes a situation
• An action allows the agent to move between states
• A policy chooses the action to take based on the current state
• After each action, a positive or negative reward is observed
Objectives (a toy sketch of these ingredients follows)
• Guide an agent to learn a policy: improve the choice of action at time t+1
• Avoid failure situations
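To ground these terms, a minimal tabular Q-learning sketch (Q-learning is one classic RL method, not prescribed here) on a made-up 5-state corridor where reaching the goal state yields a positive reward:

```python
import numpy as np

# Toy corridor: states 0..4, goal at state 4; actions: 0=left, 1=right
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))   # value of each (state, action) pair
alpha, gamma, eps = 0.5, 0.9, 0.2     # learning rate, discount, exploration
rng = np.random.default_rng(0)

for episode in range(200):
    s = 0
    while s != 4:
        # Epsilon-greedy policy (explore, or break ties randomly)
        if rng.random() < eps or Q[s, 0] == Q[s, 1]:
            a = int(rng.integers(n_actions))
        else:
            a = int(Q[s].argmax())
        s_next = max(0, s - 1) if a == 0 else min(4, s + 1)
        r = 1.0 if s_next == 4 else 0.0   # reward observed after the action
        # Q-learning update: improve the action choice for next time
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.argmax(axis=1))   # learned policy: action 1 (go right) in every state
```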
Challenges
• The ability of a model to generalise
• Overfitting
• Underfitting
• Curse of dimensionality (many features vs. dataset size)
• Vanishing and exploding gradients (in neural-network-based models)
• Data scarcity → data augmentation (for small datasets)
• Imbalanced datasets
• etc.
Generalisation
A major challenge of ML
• Ability to perform well on previously unseen inputs
Training error vs. test / generalisation error
• training error: error on the training inputs
• test / generalisation error: expected error on a new input
The ML training algorithm reduces the training error, which is a task of optimisation
What differentiates ML from pure optimisation is that the test / generalisation error needs to be low as well
Typical learning curve
(Figure: training loss and validation loss vs. number of training steps)
Overfitting
• A major problem for learning techniques!
• One can find a hypothesis that predicts the training data well but does not generalise to the rest of the data.
• In the rest of the course, we will see methods to mitigate the overfitting problem.
Vanishing and exploding gradients problem
For neural-network-based models
• Vanishing Gradients: Occur when the gradients of the loss
function with respect to the parameters become very small
during backpropagation. This prevents the weights from
updating effectively, slowing or halting learning, especially
in early layers of deep networks.
• Exploding Gradients: Occur when the gradients become
very large, causing unstable updates to the weights and
making training diverge.
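A small numpy sketch of the mechanism, assuming a chain of sigmoid layers: by the chain rule the gradient is a product of per-layer derivatives; since the sigmoid's derivative is at most 0.25 the product shrinks exponentially with depth, and large weights can make it blow up instead:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Gradient through a chain of layers is a product of local derivatives.
# sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)) <= 0.25, so the product
# shrinks exponentially with depth: the vanishing-gradient effect.
z, grad = 0.5, 1.0
for layer in range(20):
    s = sigmoid(z)
    grad *= s * (1 - s)        # multiply by this layer's local derivative
print(grad)                    # ~1e-13: almost no signal for early layers

# With a large weight w, the factor |w| * sigmoid'(z) can exceed 1,
# and the product blows up instead: the exploding-gradient effect.
w, grad = 10.0, 1.0
for layer in range(20):
    s = sigmoid(z)
    grad *= w * s * (1 - s)
print(grad)                    # very large: unstable weight updates
```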
Data augmentation
What?
• Increase the size and diversity of a training dataset
• Apply various transformations to the original data
• Used when the original dataset is small or lacks diversity
Why?
• Prevents overfitting by exposing the model to more varied data
• Improves the model's ability to generalize to unseen data
• Enhances performance in tasks like image classification, object detection, natural language processing, etc.
Data augmentation
Common Techniques:
1. Image data:
- Rotation, flipping, cropping, scaling, and translation
- Color jittering (adjusting brightness, contrast, saturation)
- Adding noise or blurring
- Random erasing or cutout
2. Text data:
- Synonym replacement, random insertion, or deletion of words
- Back-translation (translating text to another language and back)
- Shuffling sentences or phrases
3. Audio data:
- Time stretching, pitch shifting, or adding background noise
4. Tabular data:
- Adding noise to numerical features
- Synthetic minority oversampling techniques (e.g., SMOTE)
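For image data, a minimal sketch with torchvision (an assumption: no library is prescribed here); the file name is hypothetical:

```python
from torchvision import transforms
from PIL import Image

# A pipeline of random transformations applied on the fly at training time
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),      # flipping
    transforms.RandomRotation(degrees=15),       # rotation
    transforms.ColorJitter(brightness=0.2,       # color jittering
                           contrast=0.2,
                           saturation=0.2),
    transforms.RandomResizedCrop(size=224),      # cropping + scaling
])

img = Image.open("cat.jpg")        # hypothetical training image
augmented = augment(img)           # a new random variant on each call
```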
Data augmentation
(Figure: examples of augmented variants of a training image)
Imbalanced datasets
Skewed class distributions can lead to biased
models that favor the majority class.
Common techniques: resampling
• Oversampling: increase the number of instances in the minority class
- Examples: random oversampling, SMOTE (Synthetic Minority Oversampling Technique), ADASYN
• Undersampling: reduce the number of instances in the majority class
- Examples: random undersampling, Tomek links, cluster centroids
• Hybrid approaches: combine oversampling and undersampling for balanced results
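A minimal SMOTE sketch with the imbalanced-learn library (an assumption: the technique is named here, not a tool); the dataset is synthetic:

```python
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Synthetic imbalanced data: ~90% majority class, ~10% minority class
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
print(Counter(y))            # skewed class counts

# Oversample the minority class with synthetic examples
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(Counter(y_res))        # balanced class counts
```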
Prerequisites
• Knowledge of calculus: derivatives, partial derivatives, gradients, integrals, etc.
• Knowledge of linear algebra: matrices, vectors, norms, scalar products, etc.
• Knowledge of probability and statistics
• Knowledge of programming
References
• A. Géron. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O'Reilly Media, 2019.
• C. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
• R. Duda, P. Hart, and D. Stork. Pattern Classification. Wiley, 2001.
• ...