Lect8 IoT BigDataAnalyticsTechniques

Uploaded by

almuhtarif.egyptian.yahoo.com

Big Data Analytics Techniques
Presented by
Dr. Amany AbdElSamea

Outline
• Clustering
• Applications for Cluster Analysis
• Hard Clustering vs. Soft Clustering
• Types of Clustering
• K-means clustering
• Association Rules
• Linear Regression
• Logistic Regression

Big Data Techniques
Problem to solve | Category of techniques | Example
I want to group items by similarity | Clustering | K-means clustering
I want to discover relationships between actions or items | Association rules | Apriori
I want to determine the relationship between the outcome and the input variables | Regression | Linear regression, logistic regression
I want to analyze my text data | Text analysis | Term Frequency-Inverse Document Frequency (TF-IDF)
I want to assign known labels to objects | Classification | Naïve Bayes, decision trees
I want to forecast the behavior of a temporal process | Time series analysis | ARIMA
Clustering
• Clustering is the process of dividing a dataset into groups of similar data points.
• It is an unsupervised machine learning technique.
• Points in the same group are as similar as possible.
• Points in different groups are as dissimilar as possible.
Applications for Cluster Analysis
• Marketing: discover distinct groups in customer
bases, and develop targeted marketing programs.
• Biology: plant and animal taxonomies, gene functionality
• City planning: identify groups of houses according
to their house type, value, and geographical
location
• Also used for pattern recognition, data analysis, and
image processing
Types of Clustering
• Exclusive Clustering:
– Also called hard clustering
– Each data point/item belongs exclusively to one cluster
– For example: k-means clustering
• Overlapping Clustering:
– Also called soft clustering
– Each data point/item can belong to multiple clusters
– For example: fuzzy c-means clustering
• Hierarchical Clustering:
– The hierarchy of clusters is built in the form of a tree, and this tree-shaped structure is known as a dendrogram.
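The difference between hard and soft clustering can be illustrated with a minimal Python sketch. It assumes two fixed cluster centers and uses the standard fuzzy c-means membership formula with fuzzifier m = 2; the function names, the centers, and the sample point are illustrative assumptions, not part of the lecture.

```python
import math

def distances(point, centers):
    """Euclidean distance from one point to every cluster center."""
    return [math.dist(point, c) for c in centers]

def hard_assignment(point, centers):
    """Exclusive (hard) clustering: return the index of the single closest center."""
    d = distances(point, centers)
    return d.index(min(d))

def soft_memberships(point, centers, m=2.0):
    """Overlapping (soft) clustering: fuzzy c-means style degrees that sum to 1.

    m is the fuzzifier (assumed to be 2 here); larger values give softer memberships.
    """
    d = distances(point, centers)
    zeros = [x == 0.0 for x in d]
    if any(zeros):
        # The point sits exactly on one or more centers: share membership among them.
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in d) for d_i in d]

centers = [(0.0, 0.0), (4.0, 0.0)]   # hypothetical cluster centers
point = (1.0, 0.0)                   # hypothetical data point

label = hard_assignment(point, centers)
memberships = soft_memberships(point, centers)
print(label)
print(memberships)
```

With hard assignment the point receives one label; with soft membership it belongs mostly to the nearer cluster but keeps a small degree of membership in the other one.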
K-means clustering
• It is a type of unsupervised learning used when you have unlabeled data
• Aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean.
• Input: numerical variables. There must be a distance metric defined over the variable space
- For example, Euclidean distance
• Output: the center of each discovered cluster, and the assignment of each input datum to a cluster.
- Each cluster center is called a centroid
K-means Steps
1. Choose the value of k and the initial guesses
for the centroids
2. Compute the distance from each data point to
each centroid, and assign each point to the
closest centroid
3. Compute the centroid of each newly defined
cluster from step 2
4. Repeat steps 2 and 3 until the algorithm
converges (no changes occur)
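The four steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a production implementation: the initial centroids are supplied by the caller, Euclidean distance is used, an empty cluster keeps its previous centroid, and the helper names and toy points are made up for the example.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def assign(points, centroids):
    """Step 2: for every point, the index of its closest centroid."""
    return [min(range(len(centroids)), key=lambda j: euclidean(p, centroids[j]))
            for p in points]

def recompute(points, labels, centroids):
    """Step 3: mean of each cluster; an empty cluster keeps its old centroid."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centroids.append(old)
    return new_centroids

def k_means(points, initial_centroids, max_iter=100):
    """Steps 1-4: repeat assignment and update until the assignments stop changing."""
    centroids = [tuple(c) for c in initial_centroids]  # step 1: k = len(initial_centroids)
    labels = None
    for _ in range(max_iter):  # max_iter guards against endless oscillation
        new_labels = assign(points, centroids)
        if new_labels == labels:  # step 4: no changes occur, so stop
            break
        labels = new_labels
        centroids = recompute(points, labels, centroids)
    return centroids, labels

# Toy data, invented for illustration: two visually separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = k_means(points, initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels)
print(centroids)
```

The result depends on the initial guesses from step 1, which is why practical implementations usually try several starting points.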
Step 1

Set k = 3 and choose the initial cluster centers

Step 2

Points are assigned to the closest centroid

Step 3

Compute centroids of the new clusters

Step 4

• Repeat steps 2 and 3 until convergence
• Convergence occurs when the centroids do not change or when the centroids oscillate back and forth
– Oscillation can occur when one or more points are equally distant from two or more centroids
Picking K
Association Rules
• Association rule mining is another unsupervised learning method.
• It is not a predictive method: no prediction is performed; instead, the method is used to discover relationships within the data.
• It helps identify interesting patterns and connections among sets of items:
- Rules take the form of “If X is observed, then Y is also observed”
• Use case: understand customer buying habits by finding associations between the different items that customers place in their “shopping basket”
– Known as market basket analysis
– Example: the Apriori algorithm
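A rule of the form "If X is observed, then Y is also observed" is usually scored with support and confidence, the measures that Apriori relies on. The slides do not name these measures, so the sketch below is an added illustration; the baskets are invented toy data, and the code only scores a single rule rather than running the full Apriori candidate search.

```python
def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(x, y, baskets):
    """Of the baskets that contain X, the fraction that also contain Y."""
    return support(set(x) | set(y), baskets) / support(x, baskets)

# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# Score the rule "if bread is observed, then milk is also observed".
rule_support = support({"bread", "milk"}, baskets)
rule_confidence = confidence({"bread"}, {"milk"}, baskets)
print(rule_support, rule_confidence)
```

Here bread and milk appear together in 2 of 4 baskets, and 2 of the 3 baskets with bread also contain milk.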
Regression
• Regression focuses on the relationship between an outcome and its input variables.
- Provides an estimate of the outcome based on the input values
- Models how changes in the input variables affect the outcome
• Regression can identify the input variables that have the greatest statistical influence on the outcome
– One can then try to produce better values of those input variables
– E.g., if reading level at age 10 predicts students’ later success, then try to improve early-age reading level
• Approaches: linear regression and logistic regression
Linear Regression
• Models the relationship between several input
variables and a continuous outcome variable
– Assumption is that the relationship is linear
– Various transformations can be used to achieve a
linear relationship
• Linear regression models are probabilistic
– Involves randomness and uncertainty
– Not deterministic like Ohm’s Law (V=IR)
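As a minimal sketch of the idea, the code below fits a straight line to one input variable by ordinary least squares. It is simplified to a single input (the slide speaks of several), the data points are invented for illustration, and `fit_line` is a hypothetical helper name. The nonzero residuals show the probabilistic nature of the model: the line estimates the outcome but does not reproduce each observation exactly.

```python
def fit_line(xs, ys):
    """Ordinary least squares for outcome = intercept + slope * input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Invented toy data: roughly linear, with some noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(intercept, slope)
print(residuals)
```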
Model Description

Logistic Regression

• In linear regression modeling, the outcome variable is continuous, e.g., income ~ age and education
• In logistic regression, the outcome variable is categorical, like true/false, pass/fail, or yes/no

Logistic Regression
Model Description
• Logistic regression is based on the logistic function f(y)
– As y → ∞, f(y) → 1; and as y → −∞, f(y) → 0
• With the range of f(y) being (0, 1), the logistic function models the probability of an outcome occurring
• In contrast to linear regression, the values of y are not directly observed; only the values of f(y) in terms of success or failure are observed.
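The limiting behavior described above can be checked with a short Python sketch. It assumes the standard form of the logistic function, f(y) = 1 / (1 + e^(-y)), and treats y as a linear combination of one input; the coefficients in `predict_probability` are hypothetical values chosen only for illustration, not fitted parameters.

```python
import math

def logistic(y):
    """Logistic function f(y) = 1 / (1 + e^(-y)), written to avoid overflow."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    z = math.exp(y)  # for negative y, e^y is small and safe to compute
    return z / (1.0 + z)

def predict_probability(x, intercept, slope):
    """Probability of the outcome occurring for input x (coefficients are assumed)."""
    y = intercept + slope * x  # y itself is never observed, only the outcome
    return logistic(y)

print(logistic(-50.0), logistic(0.0), logistic(50.0))
print(predict_probability(3.0, intercept=-4.0, slope=1.5))
```

The first line of output shows values close to 0, exactly 0.5, and close to 1, matching the limits stated in the slide.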
Questions
