Machine Learning: Engr. Ejaz Ahmad

This document provides definitions and code snippets for common machine learning terms and techniques. It defines terms like accuracy, algorithms, attributes, bias, categorical variables, classification reports, ROC curves, and more. It also provides code examples for tasks like splitting data, measuring accuracy, preprocessing, feature scaling, pipelines, polynomial features, regression, classification, decision trees, KNN, SVMs, Naive Bayes, and measuring performance.

Machine Learning


Engr. Ejaz Ahmad
Common terms

 Accuracy
Percentage of correct predictions made by the model out of the
total number of observations
 Algorithm
A method, function, or set of instructions used to generate a
machine learning model

 Attribute
A quality describing an observation
 Bias
Error due to overly simplistic assumptions in the learning
algorithm. High bias causes underfitting
 Bias metric
Average difference between predictions and
observed values
 Bias term
Allows the model to represent patterns that do not pass
through the origin
 Categorical variable
A variable with a discrete set of possible values
Classification_report

 Recall
The true positive rate: the proportion of actual positive
events in the data that the model correctly identifies
 Precision
The positive predictive value: the number of correct
positives your model claims compared to the total number of
positives it claims
 F1 score
A measure of model performance: the harmonic mean of the
precision and recall of the model

 from sklearn.metrics import classification_report

 classification_report(y_test, y_predict)
ROC_curve

 The receiver operating characteristic (ROC) curve is used for
visual comparison of classification models. It plots the true
positive rate against the false positive rate at different
decision thresholds
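As a rough illustration of the report described above, the sketch below fits a classifier on a synthetic dataset and prints precision, recall, and F1 score per class. The data from make_classification, the choice of LogisticRegression, and the variable names are assumptions made for this example, not part of the original slides.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic binary dataset (assumed for illustration only)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LogisticRegression().fit(X_train, y_train)
y_predict = model.predict(X_test)

# Text table with precision, recall, F1 score and support for each class
report = classification_report(y_test, y_predict)
print(report)
```

The true labels go first and the predicted labels second.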
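A minimal sketch of how the points of a ROC curve can be computed with scikit-learn. The synthetic data and the LogisticRegression model are assumptions chosen only to make the example runnable.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic binary dataset (assumed for illustration only)
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
# ROC needs scores, not hard labels: use the probability of the positive class
y_score = model.predict_proba(X_test)[:, 1]

# False positive rate and true positive rate at each decision threshold
fpr, tpr, thresholds = roc_curve(y_test, y_score)
auc = roc_auc_score(y_test, y_score)
print(f"AUC = {auc:.3f}")
```

Plotting tpr against fpr for several models on the same axes gives the visual comparison; the area under the curve (AUC) summarizes each curve as a single number.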
Common terms

 Convergence
State reached during training when the
loss changes very little between iterations
 Dimension
Number of features in a dataset
 Extrapolation
Making predictions outside the range of the dataset
 Feature selection
The process of selecting the relevant features from a
dataset for the model
 Hyperparameter
A higher-level property of a model, such as learning
rate, depth of tree, or number of hidden layers
 Regularization
Restriction of the values of the weights in regression
to avoid overfitting
 Noise
Any irrelevant information or randomness in a
dataset
 Outlier
An observation that deviates significantly from
other observations
Bayes Theorem

 Bayes theorem
It gives the posterior probability of an event on the
basis of prior knowledge
 Generative model
It learns how the data in each category is distributed
 Discriminative model
It simply learns the distinction between different
categories of data
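To make the Bayes theorem definition above concrete, here is a small worked sketch in plain Python. The function name and the spam-filter numbers are hypothetical and chosen only for illustration.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(A | B) via Bayes' theorem: P(B | A) * P(A) / P(B)."""
    # Total probability of observing the evidence B
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% of emails are spam, and the word "offer"
# appears in 90% of spam emails and in 5% of non-spam emails
p_spam_given_word = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
print(round(p_spam_given_word, 3))
```

With these numbers the posterior is about 0.154: seeing the word raises the probability of spam from 1% to roughly 15%.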

 Split train and test data
 from sklearn.model_selection import train_test_split
 Accuracy measure
 from sklearn.metrics import accuracy_score
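The two imports above are typically used together. The sketch below assumes the Iris dataset and a decision tree classifier purely as stand-ins; any dataset and estimator would do.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold out 20% of the rows for testing; stratify keeps the class proportions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
y_predict = model.predict(X_test)

# Fraction of test predictions that match the true labels
acc = accuracy_score(y_test, y_predict)
print(acc)
```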
Data Preprocessing

Feature Scaling

Pipeline

Polynomial Feature



Regression

 A statistical model of the relationship between a
dependent variable and a given set of independent
variables
1. Simple Linear Regression

 Predicting a response using a single feature
 from sklearn.linear_model import LinearRegression
 model = LinearRegression()
 model.fit(X, y)
 model.intercept_, model.coef_
 model.predict(X)
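A runnable version of the calls above, as a minimal sketch. The data is generated from y = 4 + 3x without noise, which is an assumption made so that the fitted parameters are easy to check.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Noise-free data generated from y = 4 + 3x (assumed for illustration)
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = 4 + 3 * X.ravel()

model = LinearRegression()
model.fit(X, y)

# Learned intercept and slope; these should be close to 4 and 3
print(model.intercept_, model.coef_)
# Prediction for a new point x = 5; this should be close to 19
print(model.predict(np.array([[5.0]])))
```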
Gradient Descent

 It is a simple and effective approach to fitting linear
models. The general idea of gradient descent (GD) is to tweak
parameters iteratively in order to minimize the cost function
 Types
1. Batch Gradient Descent: uses the full training set at
each step
2. Stochastic Gradient Descent: picks a random
instance from the training set and calculates the gradient on it
3. Mini-batch Gradient Descent: calculates the gradient
on a small random set of instances called a mini-batch
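The batch variant described above can be sketched in a few lines of NumPy. The learning rate, iteration count, and the noise-free data from y = 4 + 3x are all assumed values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 2, size=(100, 1))
y = 4 + 3 * X[:, 0]              # noise-free target, assumed for illustration
X_b = np.c_[np.ones(len(X)), X]  # add the bias column x0 = 1

eta = 0.1            # learning rate (assumed value)
n_iterations = 2000  # number of update steps (assumed value)
m = len(X_b)
theta = np.zeros(2)  # parameters: [intercept, slope]

for _ in range(n_iterations):
    # Gradient of the mean squared error computed on the full training set
    gradients = 2 / m * X_b.T @ (X_b @ theta - y)
    # Step against the gradient to reduce the cost
    theta -= eta * gradients

print(theta)  # approaches [4, 3]
```

Replacing the full training set in the gradient line with one random row gives the stochastic variant, and with a small random subset gives the mini-batch variant.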
Stochastic Gradient Descent

 from sklearn.linear_model import SGDRegressor
 model = SGDRegressor(max_iter=1000, tol=1e-3, eta0=0.1, penalty=None)
 model.fit(X, y)
 model.predict(X)
Regularized Linear Models

 For linear models, regularization is achieved by
constraining the weights of the model
 The following are regularized versions of linear
regression
1. Ridge Regression
2. Lasso Regression
3. Elastic Net
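The three regularized variants listed above can be compared against plain linear regression by looking at the fitted weights. This is only a sketch: the synthetic data and the alpha values are assumptions.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
# Only the first feature matters; the other two are irrelevant (assumed setup)
y = 3 * X[:, 0] + rng.normal(scale=0.5, size=50)

models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1),                             # l2 penalty
    "lasso": Lasso(alpha=0.1),                           # l1 penalty
    "elastic_net": ElasticNet(alpha=0.1, l1_ratio=0.5),  # mix of l1 and l2
}

# Fit each model on the same data and collect its weight vector
coefs = {name: est.fit(X, y).coef_ for name, est in models.items()}
for name, coef in coefs.items():
    print(name, np.round(coef, 3))
```

The regularized models constrain the weights, so their weight vectors are pulled toward zero relative to the unregularized fit.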
Ridge Regression

 from sklearn.linear_model import Ridge
 model = Ridge(alpha=1, solver="cholesky")
 model.fit(X, y)
 model.predict(X)
 A similar result can be obtained using SGDRegressor
 model = SGDRegressor(penalty="l2")
 Here the l2 penalty corresponds to Ridge Regression
Lasso Regression

 from sklearn.linear_model import Lasso
 model = Lasso(alpha=0.1)
 model.fit(X, y)
 model.predict(X)
Elastic Net

 from sklearn.linear_model import ElasticNet
 model = ElasticNet(alpha=0.1, l1_ratio=0.5)
 model.fit(X, y)
 model.predict(X)
Logistic Regression

 It predicts the outcome of a categorical dependent
variable, so its output is discrete/categorical,
such as 0 or 1, yes or no, etc.

 from sklearn.linear_model import LogisticRegression
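A minimal sketch of logistic regression producing discrete outputs. The breast cancer dataset and the scaling step are assumptions made for this example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scaling the features first helps the solver converge
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

labels = model.predict(X_test[:5])       # discrete outputs: 0 or 1
probs = model.predict_proba(X_test[:5])  # class probabilities for each row
print(labels)
print(probs.round(3))
```

The predicted label is the class with the higher probability.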
Classification

 It is the process of categorizing given data into classes.
It can be performed on both structured and
unstructured data
Decision Tree

 It splits the dataset into smaller segments until the target
variable is the same within each segment or until the dataset can no
longer be split
 from sklearn.tree import DecisionTreeClassifier
 model = DecisionTreeClassifier()
 model.fit(X, y)
 model.predict(X)
 Saving the trained model
 import joblib  # sklearn.externals.joblib is no longer available in recent scikit-learn
 joblib.dump(model, "filename.joblib")
 model = joblib.load("filename.joblib")
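The train, save, and reload steps can be checked end to end with a short sketch. It assumes the standalone joblib package, the Iris dataset, and a temporary directory, none of which come from the original slides.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Save to a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tree.joblib")
    joblib.dump(model, path)
    restored = joblib.load(path)

# The restored model should give the same predictions as the original
same = bool((restored.predict(X) == model.predict(X)).all())
print(same)
```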
K-Nearest Neighbors

 It is a supervised learning method for both regression and
classification. The principle is to find a predefined
number of training samples closest to the new point and
predict its label from them
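A minimal sketch of the idea with scikit-learn's KNeighborsClassifier. The Iris dataset and n_neighbors=5 are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# n_neighbors is the predefined number of closest training samples
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# Distances to, and indices of, the 5 nearest training samples of one new point
distances, indices = model.kneighbors(X_test[:1])
print(indices, model.predict(X_test[:1]))

score = model.score(X_test, y_test)  # accuracy on the held-out data
print(score)
```

For regression, KNeighborsRegressor from the same module follows the same pattern and averages the targets of the neighbors.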
Support Vector Machine

 SVM is a very powerful and versatile machine
learning model capable of linear and nonlinear
classification, regression, and even outlier
detection
Linear SVM Classification

 from sklearn.pipeline import Pipeline
 from sklearn.preprocessing import StandardScaler
 from sklearn.svm import LinearSVC  # support vector classifier
 svm_clf = Pipeline([("scaler", StandardScaler()),
("linear_svc", LinearSVC(C=1, loss="hinge"))])
 svm_clf.fit(X, y)
 We can regularize the model more strongly by decreasing the value of the C
hyperparameter
 Another option is to use the SGDClassifier class, with
SGDClassifier(loss="hinge",
alpha=1/(m*C)), where m is the number of training instances.
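The two options above can be put side by side as follows. This is a sketch on synthetic data; the dataset, the value of C, and the step names are assumptions, and the two classifiers are only expected to behave similarly, not identically.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic binary dataset (assumed for illustration only)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
m, C = len(X), 1  # m = number of training instances

svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("linear_svc", LinearSVC(C=C, loss="hinge")),
])
sgd_clf = Pipeline([
    ("scaler", StandardScaler()),
    # alpha = 1/(m*C) is meant to give a comparable regularization strength
    ("sgd", SGDClassifier(loss="hinge", alpha=1 / (m * C), random_state=42)),
])

svm_clf.fit(X, y)
sgd_clf.fit(X, y)
svm_score, sgd_score = svm_clf.score(X, y), sgd_clf.score(X, y)
print(svm_score, sgd_score)
```

SGDClassifier is trained with stochastic gradient descent, so it does not converge as fast as LinearSVC but can handle datasets that do not fit in memory.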
Nonlinear SVM Classification

 There are two methods for nonlinear classification with SVMs
1. Make the data linearly separable by adding polynomial features
2. Use the kernel trick
1. Method: PolynomialFeatures

 from sklearn.preprocessing import PolynomialFeatures
 polynomial_svm_clf = Pipeline([
("poly_features", PolynomialFeatures(degree=3)),
("scaler", StandardScaler()),
("svm_clf", LinearSVC(C=10, loss="hinge"))
])
 polynomial_svm_clf.fit(X, y)
2. Method: Polynomial Kernel

 from sklearn.svm import SVC
 poly_kernel_svm_clf = Pipeline([
("scaler", StandardScaler()),
("svm_clf", SVC(kernel="poly", degree=3, coef0=1,
C=5))
])
 poly_kernel_svm_clf.fit(X, y)
2. Method: Gaussian RBF Kernel

 rbf_kernel_svm_clf = Pipeline([
("scaler", StandardScaler()),
("svm_clf", SVC(kernel="rbf", gamma=5, C=0.001))
])
 rbf_kernel_svm_clf.fit(X, y)

 Increasing gamma makes the bell-shaped curve narrower, so each
instance's range of influence becomes smaller

SVM Regression


Linear Regression

from sklearn.svm import LinearSVR

svm_reg = LinearSVR(epsilon=1.5)

svm_reg.fit(X, y)

Nonlinear Regression

from sklearn.svm import SVR

svm_poly_reg = SVR(kernel="poly", degree=2, C=100,
epsilon=0.1)

svm_poly_reg.fit(X, y)
SGD classifier

 It is an efficient classifier for handling large datasets
and is well suited to online learning
 from sklearn.linear_model import SGDClassifier
 sgd_clf = SGDClassifier(random_state=42)
 sgd_clf.fit(X_train, y_train_5)
Naïve Bayes

 It is a classification algorithm based on Bayes'
theorem, with the "naive" assumption that the features are
independent of each other given the class
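A minimal sketch using scikit-learn's GaussianNB, one of several Naive Bayes variants. The choice of variant and of the Iris dataset is an assumption for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# GaussianNB models each feature as normally distributed within a class
model = GaussianNB()
model.fit(X_train, y_train)

# Posterior probability of each class for the first test sample
posterior = model.predict_proba(X_test[:1])
print(posterior.round(3), model.predict(X_test[:1]))
```

The predicted class is the one with the highest posterior probability.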
Random Forest

Artificial Neural Network

Performance Measures

Classification models
Accuracy Measure using K-Fold

 K-fold cross-validation means splitting the training set
into K folds (e.g. K = 2, 3, or 4), then making predictions and
evaluating them on each fold using a model trained
on the remaining folds
 from sklearn.model_selection import cross_val_score
 cross_val_score(sgd_clf, X_train, y_train_5, cv=3,
scoring="accuracy")
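A self-contained sketch of the call above. The slides do not show how X_train and y_train_5 are built, so this example assumes scikit-learn's small digits dataset and a binary "is this digit a 5?" target.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score

# Assumed data: 8x8 digit images with a binary "is it a 5?" target
X_train, y_train = load_digits(return_X_y=True)
y_train_5 = (y_train == 5)

sgd_clf = SGDClassifier(random_state=42)

# One accuracy value per fold; each fold is scored by a model
# trained on the other two folds
scores = cross_val_score(sgd_clf, X_train, y_train_5, cv=3, scoring="accuracy")
print(scores, scores.mean())
```

Because only about one digit in ten is a 5, a model that always predicts "not 5" would already reach roughly 90% accuracy, which is why the measures that follow are also needed.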
Confusion Matrix

 It is a performance measurement technique for classification
 It has 4 entries: TP, TN, FP, FN
 from sklearn.metrics import confusion_matrix
 confusion_matrix(y_train_5, y_train_pred)
 confusion_matrix takes two arguments, the true labels and the predicted
labels. The predictions can be obtained as follows
 from sklearn.model_selection import cross_val_predict
 y_train_pred = cross_val_predict(sgd_clf, X_train, y_train_5,
cv=3)
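Putting the calls above together gives the sketch below. As before, the digits dataset and the "is it a 5?" target are assumptions standing in for the data used in the slides.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict

# Assumed data: 8x8 digit images with a binary "is it a 5?" target
X_train, y_train = load_digits(return_X_y=True)
y_train_5 = (y_train == 5)

sgd_clf = SGDClassifier(random_state=42)
# Out-of-fold prediction for every training instance
y_train_pred = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3)

cm = confusion_matrix(y_train_5, y_train_pred)
# For binary labels, rows are actual classes and columns are predicted classes
tn, fp, fn, tp = cm.ravel()
print(cm)
```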
Precision

 It is the accuracy of the positive predictions
 Precision = TP / (TP + FP)
 from sklearn.metrics import precision_score
 precision_score(y_train_5, y_train_pred)
Recall

 It is the ratio of positive instances that are correctly
detected by the classifier
 Recall = TP / (TP + FN)
 from sklearn.metrics import recall_score
 recall_score(y_train_5, y_train_pred)
Confusion Matrix

 Type I Error
A false positive error: claiming that
something has happened when in fact it has not
 Type II Error
A false negative error: claiming that
something has not happened when in fact it has
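The two error types tie directly into the precision and recall formulas. The sketch below counts them by hand on a tiny made-up label list (an assumption for illustration) and checks the result against scikit-learn.

```python
from sklearn.metrics import precision_score, recall_score

# Tiny hand-made example: 1 = the event happened, 0 = it did not
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)  # correctly claimed events
fp = sum(t == 0 and p == 1 for t, p in pairs)  # Type I errors
fn = sum(t == 1 and p == 0 for t, p in pairs)  # Type II errors

precision = tp / (tp + fp)  # 3 / (3 + 2) = 0.6
recall = tp / (tp + fn)     # 3 / (3 + 1) = 0.75
print(fp, fn, precision, recall)

# The library functions should agree with the hand-computed values
print(precision_score(y_true, y_pred), recall_score(y_true, y_pred))
```

More Type I errors lower precision, while more Type II errors lower recall.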




