ECE 498: ST:
Introduction to Applied Machine Learning
• Tao Han, Ph.D.
• Associate Professor
• Electrical and Computer Engineering
• Newark College of Engineering
• New Jersey Institute of Technology
• https://tao-han-njit.netlify.app
These slides are adapted from Prof. Hung-yi Lee's Machine Learning courses at National Taiwan University
Transfer Learning
• Example: a Dog/Cat classifier trained on labeled cat and dog images
• Can we use data not directly related to the task considered?
  • Similar domain, different tasks: e.g., photos of elephants and tigers
  • Different domains, same task: e.g., cartoon images of dogs and cats
Transfer Learning - Overview
• Source data (not directly related to the task): labelled
• Target data: labelled → Model Fine-tuning
• Warning: different terminology is used in different literature
Model Fine-tuning
• Task description
  • Source data: $(x^s, y^s)$ — a large amount
  • Target data: $(x^t, y^t)$ — very little
• One-shot learning: only a few examples in the target domain
• Example: (supervised) speaker adaptation
  • Source data: audio data and transcriptions from many speakers
  • Target data: audio data and its transcriptions of a specific user
• Idea: train a model on the source data, then fine-tune the model on the target data (a sketch follows)
• Challenge: only limited target data, so be careful about overfitting
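The idea fits in a few lines of PyTorch. This is a minimal sketch under assumptions of my own (the architecture, shapes, learning rate, and toy tensors are illustrative, not from the slides):

```python
import torch
import torch.nn as nn

# Stand-in classifier; in practice, load weights trained on the large
# source dataset, e.g. model.load_state_dict(torch.load("source.pt")).
model = nn.Sequential(
    nn.Linear(40, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

# Toy stand-in for the very small target set (x^t, y^t).
x_t = torch.randn(32, 40)
y_t = torch.randint(0, 10, (32,))

# Fine-tune with a small learning rate and few epochs; both choices
# guard against overfitting the limited target data.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()
for epoch in range(3):
    optimizer.zero_grad()
    loss = loss_fn(model(x_t), y_t)
    loss.backward()
    optimizer.step()
```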
Conservative Training
• Train on the source data (e.g., audio data of many speakers); use the resulting parameters to initialize the target model
• Fine-tune on the target data (a little data from the target speaker)
• Constraint: keep the fine-tuned model close to the source model, e.g., keep the two models' outputs on the same input close, or keep their parameters close (a sketch follows)
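One common way to realize the "parameters close" constraint is an L2 penalty pulling the fine-tuned weights back toward the source weights; a hedged sketch (the penalty weight and data are assumptions, and the "outputs close" variant would instead penalize the distance between the two models' outputs):

```python
import copy
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(40, 256), nn.ReLU(), nn.Linear(256, 10))
# Freeze a snapshot of the source-trained parameters; fine-tuning
# starts from them (initialization) and is pulled back toward them.
source = copy.deepcopy(model)
for p in source.parameters():
    p.requires_grad_(False)

x_t = torch.randn(16, 40)              # a little target-speaker data
y_t = torch.randint(0, 10, (16,))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
lam = 0.1                              # strength of the conservative penalty

for step in range(10):
    optimizer.zero_grad()
    task_loss = loss_fn(model(x_t), y_t)
    # Conservative term: squared L2 distance to the source parameters.
    prox = sum((p - q).pow(2).sum()
               for p, q in zip(model.parameters(), source.parameters()))
    (task_loss + lam * prox).backward()
    optimizer.step()
```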
Layer Transfer
• Copy some parameters of the model trained on the source data into the target model
• 1. Train only the remaining layers on the target data (prevents overfitting)
• 2. Fine-tune the whole network (if there is sufficient data)
Layer Transfer
• Which layers can be transferred (copied)?
• Images: usually copy the first few layers
• (Figure: a network from input pixels x1 … xN through Layer 1, Layer 2, …, Layer L to an output class such as "elephant"; a sketch follows)
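In code, layer transfer amounts to copying the early layers and freezing them while training the rest; a sketch with made-up layer sizes:

```python
import copy
import torch
import torch.nn as nn

# Network trained on the source task; the first layers tend to be generic.
source = nn.Sequential(
    nn.Linear(784, 512), nn.ReLU(),   # layer 1 (transferable)
    nn.Linear(512, 256), nn.ReLU(),   # layer 2 (transferable)
    nn.Linear(256, 100),              # source-task output layer
)

# Target network: copy the first layers, attach a fresh output layer.
target = nn.Sequential(
    copy.deepcopy(source[0]), nn.ReLU(),
    copy.deepcopy(source[2]), nn.ReLU(),
    nn.Linear(256, 10),               # new head for the target task
)

# 1. Train only the rest: freeze the copied layers (prevents overfitting).
for layer in (target[0], target[2]):
    for p in layer.parameters():
        p.requires_grad_(False)
stage1_opt = torch.optim.Adam(
    [p for p in target.parameters() if p.requires_grad], lr=1e-3)

# 2. If there is sufficient target data, unfreeze and fine-tune everything.
for p in target.parameters():
    p.requires_grad_(True)
stage2_opt = torch.optim.Adam(target.parameters(), lr=1e-4)
```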
Transfer Learning - Overview
• Source data (not directly related to the task): labelled
• Target data: labelled → Model Fine-tuning, Multitask Learning
• Warning: different terminology is used in different literature
Multitask Learning
• The multi-layer structure makes NNs suitable for multitask learning
• (Figure, two variants: (a) Tasks A and B share the same input feature and the lower layers, then split into task-specific upper layers; (b) Tasks A and B have different input features for task A and task B, but share some intermediate layers; a sketch of variant (a) follows)
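A sketch of the shared-lower-layers variant (all sizes and the two toy tasks are illustrative assumptions):

```python
import torch
import torch.nn as nn

class MultitaskNet(nn.Module):
    """Shared lower layers plus one output head per task."""
    def __init__(self):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Linear(40, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
        )
        self.head_a = nn.Linear(256, 10)   # Task A: 10 classes
        self.head_b = nn.Linear(256, 5)    # Task B: 5 classes

    def forward(self, x):
        h = self.shared(x)                 # representation shared by both tasks
        return self.head_a(h), self.head_b(h)

model = MultitaskNet()
loss_fn = nn.CrossEntropyLoss()
x = torch.randn(8, 40)
y_a = torch.randint(0, 10, (8,))
y_b = torch.randint(0, 5, (8,))

out_a, out_b = model(x)
# Train jointly on the (possibly weighted) sum of the per-task losses,
# so the shared layers benefit from both tasks' data.
loss = loss_fn(out_a, y_a) + loss_fn(out_b, y_b)
loss.backward()
```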
Multitask Learning - Multilingual Speech Recognition
• One network with shared lower layers on top of the acoustic features, and one output layer per language predicting the states of French, German, Spanish, Italian, and Mandarin
• This works because human languages share some common characteristics.
• Similar idea in translation: Daxiang Dong, Hua Wu, Wei He, Dianhai Yu and Haifeng Wang, "Multi-task learning for multiple language translation", ACL 2015
Multitask Learning - Multilingual
• (Figure: character error rate vs. hours of Mandarin training data, from 1 to 1000 on a log scale; the "Mandarin only" model is compared with the model trained together with European languages, which reaches a lower error rate)
• Huang, Jui-Ting, et al., "Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers", ICASSP 2013
Transfer Learning - Overview
• Source data (not directly related to the task): labelled
• Target data: labelled → Model Fine-tuning, Multitask Learning
• Target data: unlabeled → Domain-adaptation
• Warning: different terminology is used in different literature
You have learned a lot about ML. Training a classifier is not a big deal for you. ☺
• But a classifier that reaches 99.5% accuracy on data like its training data can drop to 57.5% on testing data drawn from a different distribution
• The results are from: http://proceedings.mlr.press/v37/ganin15.pdf
• Domain shift: training and testing data have different distributions → domain adaptation
Domain Shift
• Training data come from the source domain; testing data come from the target domain
• (Figure: digit images from both domains; the same digits, e.g., "0" and "1", look visibly different in the two domains)
Domain Adaptation (with labeled data)
• Source domain: labeled data (e.g., digit images labeled "4", "0", "1", …)
• Knowledge of the target domain: little data, but labeled (e.g., an image labeled "8")
• Idea: train a model on the source data, then fine-tune the model on the target data
• Challenge: only limited target data, so be careful about overfitting
Domain Adaptation (with unlabeled data)
• Source domain: labeled data (e.g., digit images labeled "4", "0", "1", "8", …)
• Knowledge of the target domain: a large amount of unlabeled data, rather than little but labeled data
Basic Idea
• Pass both source and target images through the same feature extractor (network)
• The raw inputs follow different distributions, but the extracted features should follow the same distribution
• To achieve this, the feature extractor must discard domain-specific properties, e.g., learn to ignore colors
Domain Adversarial Training
• A feature extractor followed by a label predictor maps an image to a class distribution (e.g., predicting "4")
• Trained on the labeled source data alone, the features of the source data (blue points) and the target data (red points) form clearly separated clusters
Domain Adversarial Training
• Label predictor (parameters $\theta_p$): minimizes the classification loss $L$ on labeled source data: $\theta_p^* = \min_{\theta_p} L$
• Domain classifier (parameters $\theta_d$): takes the features and predicts source vs. target, minimizing the domain loss $L_d$: $\theta_d^* = \min_{\theta_d} L_d$ (it plays the role of a discriminator)
• Feature extractor (parameters $\theta_f$): $\theta_f^* = \min_{\theta_f} (L - L_d)$ (it plays the role of a generator): it learns to "fool" the domain classifier, but also needs to support the label predictor
• Could the extractor cheat, e.g., by outputting features that are always zero so the domain classifier can never tell the domains apart? Such features would also be useless to the label predictor, so the $L$ term rules out this trivial solution.
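The objectives above are usually implemented with a gradient reversal layer, as in Ganin & Lempitsky (2015): the backward pass flips the sign of the gradient flowing from $L_d$ into the feature extractor, so one ordinary training step updates all three parts. A minimal sketch with made-up shapes and data:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates the gradient in the backward
    pass, so the feature extractor ascends L_d while theta_d descends it."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad

feature_extractor = nn.Sequential(nn.Linear(40, 64), nn.ReLU())  # theta_f
label_predictor = nn.Linear(64, 10)    # theta_p, minimizes L
domain_classifier = nn.Linear(64, 2)   # theta_d, minimizes L_d
loss_fn = nn.CrossEntropyLoss()

x_s = torch.randn(16, 40)                      # labeled source batch
y_s = torch.randint(0, 10, (16,))
x_t = torch.randn(16, 40)                      # unlabeled target batch

f_s, f_t = feature_extractor(x_s), feature_extractor(x_t)
L = loss_fn(label_predictor(f_s), y_s)         # classification loss

feats = torch.cat([f_s, f_t])
domain = torch.cat([torch.zeros(16, dtype=torch.long),   # 0 = source
                    torch.ones(16, dtype=torch.long)])   # 1 = target
L_d = loss_fn(domain_classifier(GradReverse.apply(feats)), domain)

# One backward pass: theta_p gets dL, theta_d gets dL_d, and the
# reversal layer hands theta_f the gradient of (L - L_d).
(L + L_d).backward()
```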
Domain Adversarial Training
• Yaroslav Ganin, Victor Lempitsky, "Unsupervised Domain Adaptation by Backpropagation", ICML 2015
• Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, "Domain-Adversarial Training of Neural Networks", JMLR 2016
Limitation
• (Figure: class 1 (source), class 2 (source), and target data whose classes are unknown)
• Domain adversarial training aligns the source and target feature distributions, but the decision boundaries are learned from the source domain only
• Source and target data are aligned, but aligned target points can still fall close to those boundaries; ideally the unlabeled target data should lie far from the decision boundary
Considering Decision Boundary
• Feed an unlabeled target example through the feature extractor and the label predictor, and inspect the entropy of the predicted distribution over the classes (e.g., classes 1–5)
• Small entropy (a confident, peaked prediction): the example is far from the decision boundary — good
• Large entropy (a flat prediction): the example is near the boundary — train the model to produce low-entropy predictions on target data (a sketch follows)
• Used in Decision-boundary Iterative Refinement Training with a Teacher (DIRT-T): https://arxiv.org/abs/1802.08735
• Maximum Classifier Discrepancy: https://arxiv.org/abs/1712.02560
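The entropy criterion is cheap to add as an extra loss on an unlabeled target batch; a sketch (the weighting against the source classification loss is an assumption):

```python
import torch
import torch.nn.functional as F

def entropy_loss(logits):
    """Mean entropy of the predicted class distributions: small when the
    classifier is confident, i.e. when points sit far from the boundary."""
    log_p = F.log_softmax(logits, dim=1)
    return -(log_p.exp() * log_p).sum(dim=1).mean()

# Usage sketch: total_loss = source_cls_loss + w * entropy_loss(target_logits)
target_logits = torch.randn(16, 5)   # label-predictor outputs on target data
print(entropy_loss(target_logits))
```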
Transfer Learning - Overview
• Source data (not directly related to the task): labelled
• Target data: labelled → Model Fine-tuning, Multitask Learning
• Target data: unlabeled → Domain-adaptation, Zero-shot learning
• Warning: different terminology is used in different literature
Zero-shot Learning
• Source data: $(x^s, y^s)$ — the training data
• Target data: $x^t$ — the testing data; the training and testing tasks are different
• Example: $x^s$ = images of cats and dogs with labels $y^s$ (cat/dog); at test time, $x^t$ is an image of an alpaca, a class never seen during training
• How do we solve this problem?
Zero-shot Learning
• Represent each class by its attributes, stored in a database:

  class   furry   4 legs   tail
  Dog       O       O       O
  Fish      X       X       O
  Chimp     O       X       X

• Training: the NN learns to predict the attribute vector of the input image rather than its class, e.g., a chimp image → (1, 0, 0) and a dog image → (1, 1, 1) for (furry, 4 legs, tail)
• There must be sufficient attributes for a one-to-one mapping between classes and attribute vectors
Zero-shot Learning
• Representing each class by its attributes
• Testing: the NN predicts the attributes of the test image, e.g., (0, 0, 1) for (furry, 4 legs, tail), and we output the class with the most similar attributes in the database (here: Fish)
• Again, sufficient attributes are needed for a one-to-one mapping (a sketch follows)
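A sketch of both phases. The attribute table is the one above; the network, its input size, and the training details are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Attribute database (furry, 4 legs, tail); O = 1, X = 0.
classes = ["Dog", "Fish", "Chimp"]
attrs = torch.tensor([[1., 1., 1.],    # Dog
                      [0., 0., 1.],    # Fish
                      [1., 0., 0.]])   # Chimp

# The NN outputs one probability per attribute, not one score per class;
# training would use, e.g., binary cross-entropy against the true attributes.
net = nn.Sequential(nn.Linear(40, 64), nn.ReLU(),
                    nn.Linear(64, 3), nn.Sigmoid())

# Testing: predict attributes, then return the class whose attribute
# vector is most similar (smallest squared distance).
x = torch.randn(1, 40)
pred = net(x)                                  # e.g. roughly (0, 0, 1)
dist = (attrs - pred).pow(2).sum(dim=1)
print(classes[dist.argmin()])                  # nearest-attribute class
```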
Zero-shot Learning - Attribute Embedding
• Embed images and class attribute vectors into the same embedding space: image $x^n$ is mapped to $f(x^n)$ and the attribute vector $y^n$ of its class to $g(y^n)$
• $f^*$ and $g^*$ can be NNs; training target: make $f(x^n)$ and $g(y^n)$ as close as possible
• (Figure: in the embedding space, $f(x^1)$ lies near $g(y^1)$ (attributes of chimp) and $f(x^2)$ near $g(y^2)$ (attributes of dog); a test image $x^3$ then embeds near $g(y^3)$ (attributes of alpaca), so the unseen class can be recognized)
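A sketch of training $f$ and $g$ with a plain squared-distance objective between matching pairs. Everything here is an illustrative assumption; note that with matching pairs only, both networks could collapse to a constant embedding, which is why the cited works add ranking/margin terms over non-matching pairs (omitted here):

```python
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(40, 32), nn.ReLU(), nn.Linear(32, 16))  # image -> embedding
g = nn.Sequential(nn.Linear(3, 16))                  # attribute vector -> embedding
opt = torch.optim.Adam(list(f.parameters()) + list(g.parameters()), lr=1e-3)

x = torch.randn(8, 40)                      # images x^n
y = torch.randint(0, 2, (8, 3)).float()     # attribute vectors y^n of their classes

for step in range(10):
    opt.zero_grad()
    # Training target: f(x^n) and g(y^n) as close as possible.
    loss = (f(x) - g(y)).pow(2).sum(dim=1).mean()
    loss.backward()
    opt.step()

# Zero-shot testing: embed a new image with f and predict the class whose
# embedded attribute vector g(y) is nearest, even if that class (e.g. the
# alpaca) never appeared in training.
```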
More about Zero-shot Learning
• Mark Palatucci, Dean Pomerleau, Geoffrey E. Hinton, Tom M. Mitchell, "Zero-shot Learning with Semantic Output Codes", NIPS 2009
• Zeynep Akata, Florent Perronnin, Zaid Harchaoui, Cordelia Schmid, "Label-Embedding for Attribute-Based Classification", CVPR 2013
• Andrea Frome, Greg S. Corrado, Jon Shlens, Samy Bengio, Jeff Dean, Marc'Aurelio Ranzato, Tomas Mikolov, "DeViSE: A Deep Visual-Semantic Embedding Model", NIPS 2013
• Mohammad Norouzi, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea Frome, Greg S. Corrado, Jeffrey Dean, "Zero-Shot Learning by Convex Combination of Semantic Embeddings", arXiv preprint 2013
• Subhashini Venugopalan, Lisa Anne Hendricks, Marcus Rohrbach, Raymond Mooney, Trevor Darrell, Kate Saenko, "Captioning Images with Diverse Objects", arXiv preprint 2016