Big Data
Large amounts of data, on the order of petabytes or exabytes, that are difficult to process using current database management tools or traditional data processing applications.
The main 5 V’s of big data
Volume
• Tens of thousands of IoT sensors and thousands of cameras are placed around a
massive farm, and this number will not change soon. The audio and video streams
will be of high quality.
Velocity
• It is anticipated that millions of data points will be captured and transmitted per
second from these devices.
Variety
• IoT sensors capture environmental conditions such as temperature, humidity,
and light level, while cameras capture audio and video data.
Veracity
• The sensors and cameras cannot be verified/authenticated in real time because the
overhead is too high.
Value
• The system will use AI to analyze this data and make real-time decisions about the
operation of the automated fans.
Two approaches to scaling up Big Data systems
Vertical scaling – enlarge a single machine (limited in capacity and expensive)
Horizontal scaling – use many commodity machines to form a computer cluster or grid.
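The horizontal-scaling idea can be sketched in plain Python (a toy illustration, not tied to any specific framework, and with hypothetical function names): split a large job into partitions, let each "machine" process its own partition independently, then combine the partial results.

```python
# Toy sketch of horizontal scaling: partition the work, process each
# partition independently (as separate commodity machines would), combine.

def partition(data, n_workers):
    """Split the data into roughly equal chunks, one per worker."""
    size = (len(data) + n_workers - 1) // n_workers
    return [data[i:i + size] for i in range(0, len(data), size)]

def worker_sum(chunk):
    """Work done independently on one commodity machine."""
    return sum(chunk)

def distributed_sum(data, n_workers=4):
    # In a real cluster, each partition would run on a separate machine.
    partial_results = [worker_sum(chunk) for chunk in partition(data, n_workers)]
    return sum(partial_results)  # combine step

print(distributed_sum(range(1, 101)))  # 5050
```

Adding more workers only changes how the data is partitioned, not the result, which is why commodity clusters can grow incrementally.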
Features of Hadoop
• Storage unit: Hadoop Distributed File System (HDFS)
• Replication of data (redundancy)
• MapReduce (splits the data into parts for parallel processing)
• MapReduce enables load balancing and saves time
• YARN (manages containers and provides fault tolerance)

Features of Spark
• Resilient Distributed Datasets (RDDs, held in RAM)
• Up to 100 times faster than Hadoop, and more efficient
• Spark Core (coordinates data processing across multiple computers, maintaining efficiency and smoothness)
• Spark Streaming (processes real-time data)
• Spark SQL (query data sets directly with SQL)
• Spark ML (trains large-scale machine learning models)
• Cluster manager (handles the Spark driver process and executors)
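The MapReduce model mentioned above can be illustrated with a word count, the classic example. This is a single-process sketch in plain Python, not the real Hadoop API: the map phase emits (key, value) pairs from each input split, the shuffle phase groups values by key (which the framework would do across nodes), and the reduce phase aggregates each group.

```python
from collections import defaultdict

# Map phase: emit (word, 1) pairs from one input split.
def map_phase(split):
    return [(word, 1) for word in split.split()]

# Shuffle phase: group values by key, as the framework does between nodes.
def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

# Reduce phase: aggregate the values for each key.
def reduce_phase(groups):
    return {word: sum(counts) for word, counts in groups.items()}

splits = ["big data big", "data cluster"]
pairs = [pair for split in splits for pair in map_phase(split)]
print(reduce_phase(shuffle(pairs)))  # {'big': 2, 'data': 2, 'cluster': 1}
```

Because each split is mapped independently and each key is reduced independently, both phases parallelize naturally across a cluster.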
Weaknesses of Hadoop
• Relies on storing data on disk
• Data processing is slow
• Batch processing: each batch must wait for the previous batch to complete before the results are combined into the final output
• Does not use RAM for processing

Weaknesses of Spark
• High memory consumption can lead to resource exhaustion
• Since Hadoop was introduced first, Spark is less mature than Hadoop
• When a data set does not fit in memory it spills to disk, but Spark's disk handling is weaker because Spark is designed mainly for in-memory processing
• Inefficient disk usage can lead to poor resource utilization
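The batch-versus-streaming contrast above can be sketched in plain Python (a toy illustration with hypothetical function names, not the real Hadoop or Spark APIs): a batch job emits nothing until the whole batch finishes, while Spark Streaming's micro-batch style emits partial results as each small batch completes.

```python
def batch_process(records):
    """Hadoop-style batch: nothing is emitted until the whole batch completes."""
    return [r.upper() for r in records]

def micro_batch_process(stream, batch_size=2):
    """Spark-Streaming-style micro-batches: results are emitted incrementally."""
    for i in range(0, len(stream), batch_size):
        yield [r.upper() for r in stream[i:i + batch_size]]

events = ["temp:21", "humidity:40", "light:300", "temp:22"]
print(batch_process(events))             # one result, after everything finishes
for out in micro_batch_process(events):  # partial results arrive as batches complete
    print(out)
```

The smaller the micro-batch, the lower the latency before the first result appears, at the cost of more per-batch overhead.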
Technologies for Big Data
Distributed File Systems – HDFS, Google File System (GFS)
Distributed/Parallel Programming (MapReduce Model)
NoSQL database – MongoDB, Cassandra
Large Scale Machine Learning – Deep learning
Data warehouses/ Data Lakes