Chapter 1
Introduction to Big Data
and Big Data Analytics
Dr vasu pinnti 12/07/2024
ICT
Contents
Evolution of Big data
Sources of Big Data
What is Big Data?
Characteristics of Big Data (5 Vs)
Tools used in Big Data
Introduction to Big Data analytics
Big Data analytics goals
Applications/use cases of Big Data analytics
Challenges of Big Data
How Hadoop solves the Big Data problem
Evolution of Big Data
The Model of Generating/Consuming Data has Changed
Old Model: a few companies generate data, all others consume data
New Model: all of us generate data, and all of us consume data
Unit of data size   Exact size           Approximate size                 Examples
KB (kilobyte)       2^10 or 1024 bytes   10^3 or one thousand bytes       A typical joke = 1 KB
MB (megabyte)       2^20 bytes           10^6 or one million bytes        Complete works of Shakespeare = 5 MB
GB (gigabyte)       2^30 bytes           10^9 or one billion bytes        Ten yards of books on a shelf = 1 GB
TB (terabyte)       2^40 bytes           10^12 or one trillion bytes      All the X-rays for a large hospital = 1 TB; tweets created daily = 121 TB
PB (petabyte)       2^50 bytes           10^15 or one quadrillion bytes   All U.S. academic research libraries = 2 PB; data processed in a day by Google = 24 PB
EB (exabyte)        2^60 bytes           10^18 or one quintillion bytes   Total global data created in 2006 = 161 EB
ZB (zettabyte)      2^70 bytes           10^21 or one sextillion bytes    Total global data created in 2012 = 2.7 ZB, with 44 ZB expected by 2020
YB (yottabyte)      2^80 bytes           10^24 or one septillion bytes
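The difference between the exact (power of two) and approximate (power of ten) sizes in the table can be checked with a few lines of Python. This is only a minimal sketch; the function names `exact_bytes`, `approx_bytes` and `relative_gap` are made up for this illustration and do not come from any library.

```python
# Unit prefixes from the table, smallest to largest.
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def exact_bytes(unit: str) -> int:
    """Exact (binary) size: KB = 2**10, MB = 2**20, and so on."""
    return 2 ** (10 * (UNITS.index(unit) + 1))

def approx_bytes(unit: str) -> int:
    """Approximate (decimal) size: KB = 10**3, MB = 10**6, and so on."""
    return 10 ** (3 * (UNITS.index(unit) + 1))

def relative_gap(unit: str) -> float:
    """Fraction by which the exact size exceeds the approximate size."""
    return exact_bytes(unit) / approx_bytes(unit) - 1

for unit in UNITS:
    print(f"{unit}: exact={exact_bytes(unit)} approx={approx_bytes(unit)} "
          f"gap={relative_gap(unit):.1%}")
```

The gap between the two conventions is small for a kilobyte (2.4%) but grows with every step up the table, which is why the "approximate" column becomes a rougher estimate for the larger units.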
Evolution of Big Data by technology
Evolution of Big Data by Internet of Things
Evolution of Big Data by Social Media
Evolution of Big Data by other factors
Big Data sources
Human Generated Data
includes emails, documents, photos and tweets. We are
generating this data faster than ever. Just imagine the
number of videos uploaded to YouTube and tweets
swirling around. This data can be Big Data too.
Machine Generated Data
is a new breed of data. This category consists of
sensor data and logs generated by 'machines',
such as email logs and click stream logs. Machine
generated data is orders of magnitude larger than
Human Generated Data.
Web Data
Social media data: sites like Facebook, Twitter and
LinkedIn generate a large amount of data.
Click stream data: when users navigate a website,
the clicks are logged for further analysis (such as
navigation patterns). Click stream data is important in
online advertising and e-commerce.
• 12+ TBs of tweet data every day
• 25+ TBs of log data every day
• ? TBs of data every day
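As a rough illustration of the navigation-pattern analysis mentioned for click stream data, the sketch below counts how often users move from one page to another within a session. The log records, session ids, page names and the function `count_transitions` are all hypothetical and invented for this example; a real click stream would be far larger and would normally be processed on a cluster rather than in a single script.

```python
from collections import Counter

# Hypothetical click log: (session_id, page), in the order the clicks occurred.
clicks = [
    ("s1", "/home"), ("s1", "/search"), ("s1", "/product"),
    ("s2", "/home"), ("s2", "/product"),
    ("s3", "/home"), ("s3", "/search"), ("s3", "/cart"),
]

def count_transitions(click_log):
    """Count page-to-page transitions within each session."""
    last_page = {}           # session_id -> most recent page seen
    transitions = Counter()  # (from_page, to_page) -> number of occurrences
    for session_id, page in click_log:
        if session_id in last_page:
            transitions[(last_page[session_id], page)] += 1
        last_page[session_id] = page
    return transitions

transitions = count_transitions(clicks)
for (src, dst), n in transitions.most_common(3):
    print(f"{src} -> {dst}: {n}")
```

Frequent transitions of this kind are the sort of navigation pattern an online advertiser or e-commerce site might look at when deciding where to place content.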
Sensor data: sensors embedded in roads to monitor traffic, among other uses.
• 30 billion RFID tags today (1.3B in 2005)
• 4.6 billion camera phones world wide
• 100s of millions of GPS enabled devices sold annually
• 2+ billion people on the Web by end 2011
• 76 million smart meters in 2009… 200M by 2014
What is Big Data?
Big data
is the term for a collection of data
sets so large and complex that it
becomes difficult to process using
traditional data processing
applications.
Real world examples of Big Data
Facebook: has 40 PB of data and captures 100 TB/day
Yahoo: 60 PB of data
Twitter: 8 TB/day
eBay: 40 PB of data, captures 50 TB/day
Characteristics of Big Data (5 Vs of Big Data)
1st V - Volume
Data Volume
• 44x increase from 2009 to 2020, from 0.8 zettabytes to 35 zettabytes
• Data volume is increasing exponentially
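The 44x figure can be checked directly from the two volumes on the slide, and exponential growth can be expressed as an average yearly growth rate. The sketch below assumes a constant rate between 2009 and 2020, which is a simplification made only for this illustration; the variable names are likewise made up here.

```python
# Figures taken from the slide.
start_year, start_zb = 2009, 0.8   # zettabytes in 2009
end_year, end_zb = 2020, 35.0      # zettabytes in 2020

# Overall growth factor (the slide rounds this to 44x).
factor = end_zb / start_zb

# Average yearly growth rate, assuming the same rate every year.
years = end_year - start_year
yearly_rate = factor ** (1 / years) - 1

print(f"growth factor: {factor:.2f}x over {years} years")
print(f"implied average growth per year: {yearly_rate:.1%}")
```

Under that assumption the data volume grows by roughly two fifths every year, which means it doubles about every two years.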
2nd V - Velocity: data is being generated every minute
3rd V - Variety: different kinds of data are generated from various
sources
4th V - Veracity: uncertainties and inconsistencies in big
data
5th V - Value: Mechanism to bring correct meaning out of
the data
Traditional DB vs Big Data
            Traditional database / data warehouse   Big Data
Data        TB to PB                                PB to ZB
            Only structured                         Structured and unstructured
Hardware    Big central servers                     Computer clusters
            Expensive                               Cost effective
            Hardware reliability                    Unreliable HW
            Limited scalability                     Scales further
Software    Centralized                             Distributed
            Schema based                            Not schema based
            Oracle/MySQL/SQL Server                 Hadoop
Big data tools
What is Big data analytics
Stages in Big data analytics
Big data analytics goals
1. Making organizations smarter and more efficient
Big data analytics application domains
Big data analytics use cases
IBM Big data analytics – Big data collected by smart meters
IBM Big data analytics – problem with smart meter big data
IBM Big data analytics – how smart meter big data is analysed
IBM Big data analytics – IBM smart meter solution
Types of Big data analytics
Challenges/problems with Big data
Hadoop is the solution to Big Data problems
Introduction to Big data and Analytics
ANY QUESTIONS / DOUBTS?