Machine Learning
(BITS F464)
Dr. N. L. Bhanu Murthy
BITS Pilani
Hyderabad Campus
What is Learning?
“Gain knowledge or understanding of or skill in by study,
instruction or experience” - Webster
BITS Pilani, Hyderabad Campus
What is Learning?
“Learning is any process by which a system improves
performance from experience.” - Herbert Simon
Herbert Simon (1916 - 2001)
Researcher in: Artificial Intelligence, Cognitive psychology, Computer science, Economics, Political science
Professor at: Carnegie Mellon University; University of California, Berkeley; Illinois Institute of Technology
Awards:
Turing Award, 1975
Nobel Prize in Economics, 1978
National Medal of Science, 1986
von Neumann Theory Prize, 1988
What is Machine Learning?
Machine Learning is the study of
algorithms that
improve their performance P
at some task T
with experience E
Tom Mitchell (1990)
Well-defined learning task: <P,T,E>
Example - Machine Learning
Handwriting Recognition
T: Recognizing hand-written words
P: Percentage of words correctly classified
E: Database of human-labeled images of handwritten words
Example - Machine Learning
T: Driving on four-lane highways using vision sensors
P: Average distance traveled before a human-judged error
E: A sequence of images and steering commands recorded while
observing a human driver.
Example - Machine Learning
Learning to drive an autonomous
vehicle (Pomerleau, 1989).
Examples of Successful Applications of Machine Learning
Learning to recognize spoken words
(Lee, 1989; Waibel, 1989).
Learning to classify new astronomical structures
(Fayyad et al., 1995).
Learning to play world-class backgammon
(Tesauro 1992, 1995).
Categorize email messages as spam or legitimate.
Garry Kasparov on loss to Deep Blue
• Human uses 1% calculation, 99% understanding
– based on patterns, drawing information from experience
• Machine is the opposite: 99% calculation, 1% understanding
– though this understanding is growing
Machine Learning: Magic?
No, more like gardening
Seeds = Algorithms
Nutrients = Data
Gardener = You
Plants = Programs
Machine Learning in Computer Science
Fields connected to Machine Learning (at the center of the diagram):
Speech/Audio Processing
Robotics
Planning
Natural Language Processing
Vision/Image Processing
Biomedical/Chemical Informatics
Human Computer Interaction
Financial Modeling
Analytics
They said it!!
“A breakthrough in machine learning would be worth ten Microsofts”
- Bill Gates, Chairman, Microsoft
“Machine learning is the hot new thing”
- John Hennessy, President, Stanford
“Web rankings today are mostly a matter of machine learning”
- Prabhakar Raghavan, Dir. Research, Yahoo
“Machine learning is going to result in a real revolution”
- Greg Papadopoulos, CTO, Sun
“Machine learning is today’s discontinuity” - Jerry Yang, CEO, Yahoo
“Machine learning is the next Internet”
- Tony Tether, Director, DARPA
Future prospects..
History of Technology
12 IT skills that employers can't say no to
1) Machine learning
2) Mobilizing applications
3) Wireless networking
4) Human-computer interface
5) Project management
6) General networking skills
7) Network convergence technicians
8) Open-source programming
9) Business intelligence systems
10) Embedded security
11) Digital home technology integration
12) .NET, C#, C++, Java -- with an edge
History of Machine Learning
• 1950s:
– Samuel’s checkers player
– Selfridge’s Pandemonium
• 1960s:
– Neural networks: Perceptron
– Pattern recognition
– Learning in the limit theory
– Minsky and Papert prove limitations of Perceptron
• 1970s:
– Symbolic concept induction
– Winston’s arch learner
– Expert systems and the knowledge acquisition bottleneck
– Quinlan’s ID3
– Michalski’s AQ and soybean diagnosis
– Scientific discovery with BACON
– Mathematical discovery with AM
History of Machine Learning (cont.)
• 1980s:
– Advanced decision tree and rule learning
– Explanation-based Learning (EBL)
– Learning and planning and problem solving
– Utility problem
– Analogy
– Cognitive architectures
– Resurgence of neural networks (connectionism,
backpropagation)
– Valiant’s PAC Learning Theory
– Focus on experimental methodology
• 1990s:
– Data mining
– Adaptive software agents and web applications
– Text learning
– Reinforcement learning (RL)
– Inductive Logic Programming (ILP)
– Ensembles: Bagging, Boosting, and Stacking
– Bayes Net learning
History of Machine Learning (cont.)
• 2000s:
– Support vector machines
– Kernel methods
– Graphical models
– Statistical relational learning
– Transfer learning
– Sequence labeling
– Collective classification and structured outputs
– Computer Systems Applications
• Compilers
• Debugging
• Graphics
• Security (intrusion, virus, and worm detection)
– Email management
– Personalized assistants that learn
– Learning in robotics and vision
Why does Machine Learning need math?
Calculus
– We need to identify the maximum likelihood, or minimum risk: this is optimization
– Integration allows the marginalization of continuous probability
density functions
Linear Algebra
– Many features lead to high-dimensional spaces
– Vectors and matrices allow us to compactly describe and manipulate
high dimensional feature spaces.
Vector Calculus
– All of the optimization needs to be performed in high dimensional
spaces
– Optimization of multiple variables simultaneously – Gradient Descent
– Want to take a marginal over high dimensional distributions like
Gaussians.
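As a small illustration of optimizing several variables simultaneously, here is a minimal gradient descent sketch. The objective function, learning rate, and step count are assumptions chosen for illustration; they are not part of the course material.

```python
# Minimal gradient-descent sketch (illustrative assumptions throughout).
# Objective: f(w) = (w0 - 3)^2 + (w1 + 1)^2, which has its minimum at (3, -1).

def grad(w):
    """Gradient of f at w: the partial derivatives with respect to w0 and w1."""
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]

def gradient_descent(w, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient, updating all variables at once."""
    for _ in range(steps):
        g = grad(w)
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w

w_min = gradient_descent([0.0, 0.0])
print(w_min)  # close to [3.0, -1.0]
```

Each step moves every coordinate at once, which is why the gradient (a vector of partial derivatives) is the natural tool in high-dimensional spaces.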
Teaching & Evaluation (BITS C464 – L P U – 3 0 3)
Evaluation Components & Criteria
Component      Weightage (out of 200)   Duration     Date               Mode
Mid Test       50                       90 minutes   As per Timetable   Closed Book
Quiz (2)       30 (15 for each quiz)    30 minutes   TBD                Closed Book
Assignments    40                       -            -                  Open Book
Comprehensive  80                       3 hours      As per Timetable   Closed Book
Make-up Policy: Make-up for other tests will be granted only with prior permission and on
justifiable grounds.
Course Notices: All notices pertaining to this course will be displayed on the LTC
Notice Board as well as the CS & IS Notice Board.
Chamber Consultation: Friday 1600 Hrs – 1700 Hrs
Text Book
T1. Christopher Bishop: Pattern Recognition and Machine Learning,
Springer International Edition.
Reference Book
R1. Tom M. Mitchell: Machine Learning, The McGraw-Hill Companies, Inc.
Supervised Learning
Decision Tree Learning
Target Concept
“Days on which my friend, yar, enjoys his favorite water sport”
(you may find it more intuitive to think of
“Days on which the beach will be crowded” concept)
Task
Learn to predict the value of EnjoySport/Crowded for an arbitrary day
Training Examples for the Target Concept
Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
0        Sunny  Warm     Normal    Strong  Warm   Same      Yes
1        Sunny  Warm     High      Strong  Warm   Same      Yes
2        Rainy  Cold     High      Strong  Warm   Change    No
3        Sunny  Warm     High      Strong  Cool   Change    Yes
Supervised Learning
Decision Tree Learning
Hypothesis space search
Occam’s razor
Overfitting
Measures for selecting attributes: entropy, Gini index, etc.
Issues in DT learning
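To make the attribute-selection measures concrete, the sketch below computes entropy, the Gini index, and information gain on the four EnjoySport training examples. The function names and the dictionary encoding of the rows are assumptions made for this illustration.

```python
from collections import Counter
from math import log2

# The four EnjoySport training examples, encoded as dictionaries (hypothetical encoding).
ROWS = [
    {"Sky": "Sunny", "AirTemp": "Warm", "Humidity": "Normal", "Wind": "Strong",
     "Water": "Warm", "Forecast": "Same", "EnjoySport": "Yes"},
    {"Sky": "Sunny", "AirTemp": "Warm", "Humidity": "High", "Wind": "Strong",
     "Water": "Warm", "Forecast": "Same", "EnjoySport": "Yes"},
    {"Sky": "Rainy", "AirTemp": "Cold", "Humidity": "High", "Wind": "Strong",
     "Water": "Warm", "Forecast": "Change", "EnjoySport": "No"},
    {"Sky": "Sunny", "AirTemp": "Warm", "Humidity": "High", "Wind": "Strong",
     "Water": "Cool", "Forecast": "Change", "EnjoySport": "Yes"},
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(rows, attribute, target="EnjoySport"):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in set(r[attribute] for r in rows):
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

labels = [r["EnjoySport"] for r in ROWS]
print(round(entropy(labels), 4), round(gini(labels), 4))
for attr in ["Sky", "AirTemp", "Humidity", "Wind", "Water", "Forecast"]:
    print(attr, round(information_gain(ROWS, attr), 4))
```

On this tiny data set Sky and AirTemp separate the classes perfectly, so they have the highest gain, while Wind takes a single value and has zero gain.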
Example decision tree:
Outlook
  Sunny: Humidity
    High: No
    Normal: Yes
  Overcast: Yes
  Rain: Wind
    Strong: No
    Weak: Yes
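The Outlook/Humidity/Wind tree on this slide can be represented and applied as in the following sketch. The nested-dictionary layout and the `classify` helper are hypothetical choices for illustration, not a prescribed representation.

```python
# The slide's tree as nested dictionaries (hypothetical layout).
# Internal nodes name an attribute; leaves are the predicted labels "Yes"/"No".
TREE = {
    "attribute": "Outlook",
    "branches": {
        "Sunny": {"attribute": "Humidity",
                  "branches": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain": {"attribute": "Wind",
                 "branches": {"Strong": "No", "Weak": "Yes"}},
    },
}

def classify(tree, example):
    """Walk from the root to a leaf, following the branch for each tested attribute."""
    node = tree
    while isinstance(node, dict):
        node = node["branches"][example[node["attribute"]]]
    return node

day = {"Outlook": "Sunny", "Humidity": "High", "Wind": "Weak"}
print(classify(TREE, day))  # No
```

Only the attributes on the path from root to leaf are consulted, so Wind is ignored for a sunny day.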
Generative and Discriminative Models: An analogy
Task: determine which language a speaker is using
Generative approach: learn each language, then determine
which language the speech belongs to
Discriminative approach: determine the linguistic differences
without learning any language, a much easier task!
Taxonomy of ML Models
Supervised Learning
Support Vector Machine (SVM)
V. Vapnik
[Figure: two classes of points (+1 and -1) plotted on axes x1 and x2; the final panel marks the Margin (“safe zone”).]
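The margin idea can be sketched numerically: for a separating line w.x + b = 0, the margin is the smallest distance from any training point to the line, and an SVM prefers the separator with the largest margin. The points and the two candidate lines below are made-up values for illustration only; no SVM training is performed here.

```python
from math import sqrt

def margin(w, b, points, labels):
    """Smallest signed distance y * (w.x + b) / ||w|| over all points.

    A positive result means every point lies on its correct side of the line.
    """
    norm = sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )

# Hypothetical 2-D training set: two points per class.
points = [(1.0, 1.0), (2.0, 0.0), (4.0, 4.0), (5.0, 3.0)]
labels = [-1, -1, +1, +1]

# Two hypothetical separating lines.
margin_a = margin((1.0, 1.0), -5.0, points, labels)  # line x1 + x2 = 5
margin_b = margin((1.0, 0.0), -3.0, points, labels)  # line x1 = 3

print(margin_a, margin_b)
```

Both lines classify all four points correctly, but the first leaves a wider "safe zone", so it is the one a maximum-margin criterion would favor among these two candidates.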
Supervised Learning
Artificial Neural Networks (ANN)
Networks of processing units (neurons) with connections
(synapses) between them
Large number of neurons: about 10^11 (with roughly 10^14 synapses)
Large connectivity: 10^4
Parallel processing
Distributed computation/memory
Robust to noise, failures
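A network of simple processing units can be sketched in a few lines. The example below wires three threshold neurons into a network that computes XOR; the weights and biases are set by hand for illustration (they are assumptions, not learned values), and the step activation is one simple choice among many.

```python
def step(z):
    """Threshold activation: the unit fires (1) when its net input is positive."""
    return 1 if z > 0 else 0

def neuron(inputs, weights, bias):
    """One processing unit: weighted sum of inputs plus bias, then activation."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor_network(x1, x2):
    """Two hidden units feeding one output unit (hand-set weights)."""
    h_or = neuron([x1, x2], [1.0, 1.0], -0.5)    # fires if x1 OR x2
    h_and = neuron([x1, x2], [1.0, 1.0], -1.5)   # fires if x1 AND x2
    return neuron([h_or, h_and], [1.0, -1.0], -0.5)  # OR but not AND

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

A single threshold unit cannot compute XOR, but connecting units in layers can, which is one reason networks of neurons are more expressive than a lone perceptron.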
Supervised Learning
Regression
Polynomial Curve Fitting
Model Selection
Overfitting & Regularization
Probabilistic interpretation
Bayesian curve fitting
Linear Basis Function Models
Bias – Variance Decomposition
Bayesian Linear Regression
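Polynomial curve fitting with regularization can be sketched in plain Python as follows. This is a minimal sketch assuming a sum-of-squares error with a quadratic penalty on all coefficients, solved through the normal equations; the helper names, the data, and the value of `lam` are illustrative assumptions.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_polynomial(xs, ts, degree, lam=0.0):
    """Regularized least squares: solve (Phi^T Phi + lam * I) w = Phi^T t."""
    phi = [[x ** j for j in range(degree + 1)] for x in xs]  # design matrix
    k = degree + 1
    a = [[sum(row[i] * row[j] for row in phi) + (lam if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(row[i] * t for row, t in zip(phi, ts)) for i in range(k)]
    return solve(a, b)

# Noise-free samples of t = 1 + 2x - 3x^2 (made-up data for illustration).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ts = [1.0 + 2.0 * x - 3.0 * x * x for x in xs]

w = fit_polynomial(xs, ts, degree=2)               # unregularized fit
w_reg = fit_polynomial(xs, ts, degree=2, lam=1.0)  # penalized fit
print(w)
print(w_reg)
```

With `lam=0.0` the fit recovers the generating coefficients; with a positive `lam` the coefficient vector shrinks toward zero, which is the mechanism regularization uses to curb overfitting.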
Unsupervised Learning
Clustering Algorithms
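As one example of a clustering algorithm, here is a minimal k-means sketch on one-dimensional data. The slide does not name a specific algorithm, so the choice of k-means, the data, and the initial centers are all assumptions for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Alternate between assigning points to the nearest center and re-averaging."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]  # made-up unlabeled data
centers, clusters = kmeans_1d(values, centers=[0.0, 6.0])
print(centers)
print(clusters)
```

No labels are used at any point: the grouping emerges only from distances between the data points, which is what makes the method unsupervised.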
Model / Hypothesis Evaluation
t-tests
Precision
Recall
F-Measure
AUC for ROC curves
R^2
Spearman / Pearson Correlation
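Precision, recall, and the F-measure can be computed directly from true and predicted labels, as in this minimal sketch. The function name, the label vectors, and the choice of the balanced F1 variant of the F-measure are assumptions for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Made-up labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)  # 0.75 0.75 0.75
```

Precision asks how many predicted positives were correct, recall asks how many actual positives were found, and F1 is their harmonic mean.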
Thank You!!