
DATA ANALYTICS (UNIT-03)

Name: Varundeep Singh
Introduction to the Stream Concept
Data streams are continuous, unbounded, and high-speed flows of data
generated in real-time by various sources, such as sensors, social media,
network traffic, or transaction logs. Mining data streams refers to the process
of extracting meaningful patterns, insights, or knowledge from this ongoing
flow of data.
Key Characteristics of Data Streams
1. Continuous Flow: Data arrives in real-time and cannot be "paused" for analysis.
2. Unbounded: Unlike traditional datasets, the size of a data stream is theoretically infinite.
3. High-Speed: Data is generated and transmitted at high velocity, requiring quick processing.
Stream Data Model and Architecture
Stream Computing
1. Throughput Calculation
Problem: A stream processing system receives data at a rate of 5,000 events per second. If the system processes 25,000 events in 5 seconds, what is the throughput of the system in events per second?
Sol: Throughput is defined as the number of events processed per second:
Throughput = 25,000 / 5 = 5,000 events/second
Ans: 5,000 events/second (equal to the input rate).
2. Latency Estimation
Problem: A stream processing system has a total latency of 200 milliseconds to process one event. If the input rate is 1,000 events per second, how many events can the system process in one second without exceeding the processing capacity?
Sol: The system latency is 200 milliseconds (ms) per event. Convert this to seconds:
Latency per Event = 200 ms = 0.2 seconds
The number of events the system can process in one second is:
Events per Second = 1 / 0.2 = 5
Ans: The system can process 5 events/second based on its latency, far below the 1,000 events/second input rate.
3. Windowing Operations
Problem: A tumbling window of 5 seconds is applied to a data stream. If the input rate is 1,000 events per second, how many events are processed in each window?
Sol: The number of events processed in each window is:
Events per Window = Input Rate × Window Size = 1,000 × 5 = 5,000
Ans: Each window processes 5,000 events.
4. Sliding Window Operations
Problem: A sliding window of size 10 seconds with a slide interval of 2 seconds is applied to a data stream. If the input rate is 500 events per second, how many windows overlap at any given time?
Sol: The overlap factor is calculated as:
Overlap = Window Size / Slide Interval = 10 / 2 = 5
Ans: At any given time, 5 windows overlap.
5. Fault Tolerance and Checkpointing
Problem: A stream processing system performs checkpointing every 30 seconds. If the system crashes at 85 seconds, how many events are lost if the input rate is 2,000 events per second?
Sol: Checkpointing occurs every 30 seconds, so the last checkpoint before the crash was at 60 seconds. The time since the last checkpoint is:
Time since Last Checkpoint = 85 − 60 = 25 seconds
The number of events lost is:
Events Lost = 25 × 2,000 = 50,000
Ans: The system loses 50,000 events.
6. Resource Allocation
Problem: A stream processing task requires 2 GB of memory per 1,000 events/second. If the input rate is 15,000 events/second, how much memory is required?
Sol: The memory requirement scales linearly with the input rate:
Memory = (15,000 / 1,000) × 2 GB = 30 GB
Ans: The task requires 30 GB of memory.
7. Event-Time vs. Processing-Time Lag
Problem: Events in a stream have an average event-time delay of 2 seconds. If the system processes events with an average latency of 3 seconds, what is the total lag between event generation and processing?
Sol: The total lag is the sum of event-time delay and processing latency:
Total Lag = 2 + 3 = 5 seconds
Ans: The total lag is 5 seconds.
Sampling Data in a Stream
Sampling in data streams involves selecting a subset of data points from the
continuous flow for analysis or estimation.

1. Reservoir Sampling
Reservoir sampling is a randomized algorithm used to sample a fixed-size subset of k items from a stream of unknown size N. It ensures that each item in the stream has an equal probability of being included in the sample, even though the size of the stream might not be known in advance.

Problem: A data stream has 1,000,000 events. You want to select a random
sample of 100 events using reservoir sampling. At the arrival of the 10,000th
event, what is the probability that this event is included in the reservoir?
• Sol:
In reservoir sampling, the probability that the i-th element is in a sample of size k at any point is:
P(selected) = k / i
For the 10,000th event:
P(selected) = 100 / 10,000 = 0.01
Answer: The probability that the 10,000th event is included in the sample is 1%.
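A minimal Python sketch of reservoir sampling (function and variable names are illustrative):

```python
import random

def reservoir_sample(stream, k):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    reservoir = []
    for i, item in enumerate(stream, start=1):
        if i <= k:
            reservoir.append(item)       # fill the reservoir with the first k items
        else:
            j = random.randint(1, i)     # choose a slot uniformly from 1..i
            if j <= k:
                reservoir[j - 1] = item  # item i replaces a member with probability k/i
    return reservoir

sample = reservoir_sample(range(1_000_000), 100)
print(len(sample))  # 100
```

After the whole stream is seen, each item remains in the sample with probability k/N, matching the k/i formula above.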
2. Sliding Window Sampling
• Sliding Window Sampling is a technique used in stream processing to sample
data points from a continuous data stream within a "sliding window" of recent
events. The "window" moves continuously as new data arrives, retaining only
the most recent data while discarding older data.
Problem: A sliding window of size 1,000 events is used, and every 10th event is
sampled from the window. If the input rate is 2,000 events per second, how
many samples are collected per second?
• Solution:
1.Number of events in a second: 2,000 events
2.Sampling rate: Every 10th event is sampled.
Samples per Second = 2000/10 = 200 samples/second
Answer: The system collects 200 samples per second.

3. Stratified Sampling
• Stratified Sampling is a statistical technique used to ensure that a sample
accurately represents the underlying population by dividing the population
into distinct groups, or "strata," and then sampling proportionally from each
group. This method reduces sampling bias and improves the precision of
estimates compared to simple random sampling.

Problem: A stream has two strata:


Stratum A: Accounts for 30% of the stream.
Stratum B: Accounts for 70% of the stream. You want to collect 1,000 samples
from the stream using stratified sampling. How many samples should be taken
from each stratum?
• Sol:- In stratified sampling, the number of samples from each stratum is
proportional to its size.
Samples from Stratum A=Total Samples × Proportion of Stratum
Samples from Stratum A = 1,000×0.3 = 300
Samples from Stratum B=1,000×0.7 = 700
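The proportional allocation above can be sketched in Python (the function name and stratum labels are illustrative):

```python
def stratified_counts(total_samples, proportions):
    """Allocate a sample budget across strata in proportion to their share of the stream."""
    return {name: round(total_samples * p) for name, p in proportions.items()}

allocation = stratified_counts(1000, {"A": 0.3, "B": 0.7})
print(allocation)  # {'A': 300, 'B': 700}
```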

4. Sampling for Estimation


• Problem: A data stream has a total of 1,000,000 events, and you take a
random sample of 10,000 events to estimate the mean value of an attribute. If
the sample mean is 50 and the population standard deviation is 10, calculate
the 95% confidence interval for the population mean.
• Solution: The confidence interval for the mean is given by:
CI = x̄ ± z · (σ / √n)
Where:
• x̄ = 50 (sample mean),
• σ = 10 (population standard deviation),
• n = 10,000 (sample size),
• z = 1.96 (for 95% confidence level).
CI = 50 ± 1.96 × (10 / 100) = 50 ± 0.196
Answer: The 95% confidence interval is (49.804, 50.196).
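The interval can be checked with a few lines of Python (the helper name is illustrative):

```python
import math

def confidence_interval(sample_mean, sigma, n, z=1.96):
    """Confidence interval for the population mean: x-bar ± z·σ/√n."""
    margin = z * sigma / math.sqrt(n)
    return (sample_mean - margin, sample_mean + margin)

low, high = confidence_interval(50, 10, 10_000)
print(round(low, 3), round(high, 3))  # 49.804 50.196
```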


5. Bernoulli Sampling
• Bernoulli Sampling is a probabilistic sampling technique where each item in a dataset or data stream is independently selected with a fixed probability p. It is named after Jacob Bernoulli, reflecting its foundation in Bernoulli trials, where each event has only two possible outcomes: "selected" or "not selected."

• Problem: A data stream generates 10,000 events per second, and you apply
Bernoulli sampling with a probability of p=0.05. How many events are
expected to be sampled per second?
• Solution: The expected number of sampled events is:
Expected Samples per Second=p × Total Events per Second
Expected Samples per Second=0.05×10,000=500
Answer: 500 events are expected to be sampled per second.
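A Bernoulli-sampling sketch in Python; the seed is an arbitrary choice for reproducibility, and the realized count only approximates the expected 500:

```python
import random

def bernoulli_sample(stream, p, seed=42):
    """Independently keep each event with fixed probability p."""
    rng = random.Random(seed)    # seeded so the run is reproducible
    return [x for x in stream if rng.random() < p]

sampled = bernoulli_sample(range(10_000), 0.05)
print(len(sampled))              # close to the expected 0.05 × 10,000 = 500
```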
Filtering a Stream
Filtering a stream refers to the process of extracting or processing specific
pieces of data from a continuous flow of information (referred to as a
"stream"). This concept is commonly used in data processing, programming,
and systems that handle real-time data, such as sensor readings, log files, or
live social media feeds.

(code in Jupyter)
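A minimal Python sketch of stream filtering; the readings and the threshold of 100 are hypothetical:

```python
def filter_stream(stream, predicate):
    """Lazily yield only the events that satisfy the predicate."""
    for event in stream:
        if predicate(event):
            yield event

# Hypothetical sensor readings; keep only values above a threshold of 100.
readings = [95, 120, 87, 150, 101, 99]
kept = list(filter_stream(readings, lambda r: r > 100))
print(kept)  # [120, 150, 101]
```

Because the function is a generator, it can consume an unbounded stream without buffering it.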
Counting Distinct Elements in a Stream

We'll look at two approaches:
1. Exact counting using a hash set
2. Approximate counting using the Flajolet–Martin algorithm
Exact Counting Using a Hash Set


• When counting distinct elements in a stream exactly, a hash set is a
perfect choice because it inherently stores only unique elements.
Here's a step-by-step explanation and example implementation:
• Algorithm:
1. Initialize an empty set (distinct_set)
2. Iterate through each element in the stream.
3. Add the element to the set (if it's already present, the set ignores it).
4. After processing the stream, the size of the set represents the number of
distinct elements.
Example
Stream: [5, 1, 2, 3, 5, 2, 1, 6, 7]

Final set:{5,1,2,3,6,7}
Count of distinct elements: 6
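The four steps above reduce to a few lines of Python:

```python
def count_distinct(stream):
    """Exact distinct count: a set stores each element at most once."""
    distinct = set()
    for element in stream:
        distinct.add(element)  # duplicates are silently ignored by the set
    return len(distinct)

stream = [5, 1, 2, 3, 5, 2, 1, 6, 7]
print(count_distinct(stream))  # 6
```

The cost is O(n) memory in the number of distinct elements, which motivates the approximate method next.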
• Approximate Counting with Flajolet-Martin Algorithm
The Flajolet-Martin (FM) algorithm is a probabilistic method for estimating
the number of distinct elements in a stream. It is particularly memory-
efficient and is widely used for large-scale data streaming scenarios.
• Stream : [5, 1, 2, 3, 5, 2, 1, 6, 7]
• Walkthrough:
1. Hash each element with a hash function h(x). For explanation purposes, we simulate a simple one:
h(x) = (x × 31) mod 256
This function multiplies the input number by 31 and then takes the remainder when divided by 256. The output always lies in the range [0, 255], which fits within an 8-bit binary number. The resulting hash values are:
5 → 155 (binary: 10011011)
1 → 31 (binary: 00011111)
2 → 62 (binary: 00111110)
3 → 93 (binary: 01011101)
6 → 186 (binary: 10111010)
7 → 217 (binary: 11011001)
2. Find the position of the rightmost 1-bit for each hashed value:
155 → Position: 0
31 → Position: 0
62 → Position: 1
93 → Position: 0
186 → Position: 1
217 → Position: 0
3. Set BITMAP[position] = 1 for each value. Positions 0 and 1 are set, so the leftmost unset position is R = 2, giving the estimate 2^R / φ ≈ 4 / 0.77351 ≈ 5.2, close to the true count of 6.
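The walkthrough can be sketched in Python using the same toy hash h(x) = (x × 31) mod 256; the constant φ ≈ 0.77351 is the standard Flajolet–Martin bias-correction factor:

```python
def h(x):
    return (x * 31) % 256                # toy hash from the walkthrough

def rightmost_one(y, width=8):
    """Position of the least significant 1-bit; returns width when y == 0."""
    if y == 0:
        return width
    pos = 0
    while y & 1 == 0:
        y >>= 1
        pos += 1
    return pos

def fm_estimate(stream, width=8):
    """Flajolet–Martin estimate of the number of distinct elements."""
    bitmap = [0] * width
    for x in stream:                     # duplicates hash to the same bit
        bitmap[rightmost_one(h(x), width)] = 1
    R = bitmap.index(0)                  # leftmost position still unset
    return (2 ** R) / 0.77351            # bias-corrected estimate

stream = [5, 1, 2, 3, 5, 2, 1, 6, 7]
print(round(fm_estimate(stream), 2))     # 5.17, close to the true count of 6
```

A real deployment would average over many independent hash functions; a single toy hash is only for illustration.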
Counting Ones in a Window in a Data Stream
Problem
• Given a binary data stream S = [1, 0, 1, 1, 0, 1, 0, 1], count the number of 1's in a sliding window of size W = 4.
• Step-by-Step Solution
We maintain:
1.A queue to represent the sliding window.
2.A counter to track the number of 1’s in the current window
• Algorithm
1. Initialize an empty queue and set the count of 1's to zero.
2. For each incoming data point, add the new element to the queue.
3. Increment the count if the new element is 1.
4. If the window size exceeds W, remove the oldest element from the queue; decrement the count if the removed element is 1.
5. The count at any point represents the number of 1's in the current window.
Execution Walkthrough

Initial State:
•Window: []
•Count: 0

Process Each Element:


1.Add 1:
•Window: [1]
•Count: 1
2.Add 0:
•Window: [1, 0]
•Count: 1
3.Add 1:
•Window: [1, 0, 1]
•Count: 2
4.Add 1:
•Window: [1, 0, 1, 1]
•Count: 3
5.Add 0 (Exceeds window size, remove oldest):
•Window: [0, 1, 1, 0]
•Count: 2
6.Add 1 (exceeds window size, remove oldest):
•Window: [1, 1, 0, 1]
•Count: 3
7.Add 0 (exceeds window size, remove oldest):
•Window: [1, 0, 1, 0]
•Count: 2
8.Add 1 (exceeds window size, remove oldest):
•Window: [0, 1, 0, 1]
•Count: 2

Stream: [1, 0, 1, 1, 0, 1, 0, 1]
Window size: 4
After processing element 1 (1): Count of 1s = 1
After processing element 2 (0): Count of 1s = 1
After processing element 3 (1): Count of 1s = 2
After processing element 4 (1): Count of 1s = 3
After processing element 5 (0): Count of 1s = 2
After processing element 6 (1): Count of 1s = 3
After processing element 7 (0): Count of 1s = 2
After processing element 8 (1): Count of 1s = 2
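The queue-and-counter algorithm above, written as a Python generator that yields the output listed:

```python
from collections import deque

def ones_in_window(stream, w):
    """Yield the count of 1's in the last w elements after each arrival."""
    window, count = deque(), 0
    for bit in stream:
        window.append(bit)
        count += bit
        if len(window) > w:
            count -= window.popleft()  # evict the oldest element, adjusting the count
        yield count

counts = list(ones_in_window([1, 0, 1, 1, 0, 1, 0, 1], 4))
print(counts)  # [1, 1, 2, 3, 2, 3, 2, 2]
```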
• Other approaches include the following:
Decaying window
A decaying window is an alternative to a fixed-size sliding window where
recent elements in the data stream are given more weight, and the
influence of older elements gradually "decays" over time. This is especially
useful in scenarios where you want to emphasize recent trends without
strictly limiting the window size.
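One common realization is an exponentially decaying sum, where each step multiplies the running total by (1 − λ) before adding the new element; the decay rate λ = 0.1 below is an arbitrary illustration:

```python
def decaying_sum(stream, decay=0.1):
    """Exponentially decayed running sum: older elements fade by (1 - decay) per step."""
    total = 0.0
    history = []
    for x in stream:
        total = total * (1 - decay) + x  # old contributions shrink; the new one enters at full weight
        history.append(total)
    return history

sums = decaying_sum([1, 0, 1, 1, 0, 1, 0, 1])
print(round(sums[-1], 3))  # 3.535
```

Unlike a fixed window, no element is ever dropped outright; its influence just shrinks geometrically.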
Real-Time Analytics Platform (RTAP) Applications
A Real-Time Analytics Platform (RTAP) processes and analyzes data as it is
generated, enabling businesses and systems to act immediately on insights. RTAPs
are increasingly critical in industries where rapid decision-making is crucial. Here are
the main applications across various sectors:

1. E-Commerce and Retail
• Dynamic Pricing: Adjust prices in real-time based on demand, competitor prices,
and inventory levels.
• Personalized Recommendations: Suggest products based on real-time browsing
behavior, purchase history, and trends.
• Fraud Detection: Identify unusual transaction patterns or account behavior
instantly.
• Inventory Management: Monitor stock levels and predict restocking needs based
on live sales data.
2. Financial Services
• High-Frequency Trading: Analyze market data streams to execute trades within
milliseconds.
• Fraud Prevention: Detect suspicious transactions or fraudulent activities in real time.
• Risk Management: Continuously monitor portfolios and market conditions to manage
risks dynamically.
• Customer Analytics: Provide personalized banking services and investment
recommendations.

3. Healthcare
• Remote Patient Monitoring: Analyze real-time data from wearable devices to monitor
vital signs and detect anomalies.
• Predictive Diagnostics: Identify health risks by analyzing data from connected devices and
electronic health records (EHR).
4. Telecommunications
• Network Monitoring and Optimization: Identify and resolve network issues
proactively by analyzing live traffic data.
• Customer Experience Management: Personalize services and detect churn
risks through real-time usage analysis.
• Fraud Detection: Identify irregularities like SIM box fraud or unauthorized
account access.

5. IoT and Smart Cities
• Traffic Management: Monitor and control traffic flow based on data from
sensors, cameras, and GPS devices.
• Energy Management: Optimize power grid operations by analyzing live
energy consumption and production data.
• Public Safety: Detect and respond to emergencies like accidents or crimes
using data from IoT devices and cameras.
The Flajolet–Martin Algorithm
Assume our multiset is M, with elements e1, …, en.
Suppose we are given a function hash(e) which maps elements to integers uniformly distributed over the range 0 to 2^L − 1, where "L" is the number of bits required to represent the range of integers produced by hash(e).
We also define a function bit(y, k) which returns the k-th bit of the binary number y, such that:
bit(y, k) = ⌊y / 2^k⌋ mod 2
Then we define a function p(y) which returns the position of the least significant 1-bit of y, defined formally as:
p(y) = min{ k ≥ 0 : bit(y, k) = 1 }
For convenience we will assume p(0) = L.
For each element e, the algorithm sets BITMAP[p(hash(e))] = 1 in a bitmap of L bits that starts out all zero. For example, with L = 4:
INITIAL BITMAP = [0, 0, 0, 0]
After processing the stream: BITMAP = [1, 0, 1, 1]
If R is the index of the leftmost 0 in BITMAP, the number of distinct elements is estimated as 2^R / φ, where φ ≈ 0.77351.
Filtering Streams in Data Analytics
1. Rule-Based Filtering: Applies predefined conditions or thresholds to a data stream, retaining only the relevant data and discarding the rest. The rules are typically based on domain knowledge or specific requirements.
2. Window-Based Filtering: Processes a continuous data stream within a time-based or event-based window to filter, aggregate, or summarize data.
Window Types:
• Sliding Window: Overlapping windows that continuously move over the data.
• Tumbling Window: Non-overlapping, fixed-size windows.
• Session Window: Defined by a period of inactivity in the stream.
Example 1: Time-Based Sliding Window
Scenario:
• Stream=[100,102,101,103,104,105] Use a sliding window of size 3 to calculate the average price in each
window.
Example 2: Event-Based Tumbling Window
• Scenario:
A sensor produces a stream of readings:
• Stream=[5,8,7,6,10,9,8] Aggregate every 3 readings into a single sum.
Example 3: Session Window
Scenario:
Website user clicks with timestamps (in seconds):
Stream = [5, 10, 15, 35, 40, 45, 90]. Define a session window by a gap of less than 20 seconds, grouping clicks whose intervals are < 20 seconds apart:
Sessions = {[5, 10, 15], [35, 40, 45], [90]}
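The three window examples above can be computed with a short Python sketch (plain lists stand in for the stream, and the helper names are illustrative):

```python
def sliding_avg(stream, size):
    """Average over each overlapping window of `size` consecutive values."""
    return [sum(stream[i:i + size]) / size for i in range(len(stream) - size + 1)]

def tumbling_sum(stream, size):
    """Sum over non-overlapping, fixed-size windows."""
    return [sum(stream[i:i + size]) for i in range(0, len(stream), size)]

def session_windows(timestamps, gap):
    """Start a new session whenever the gap to the previous event reaches `gap`."""
    sessions = [[timestamps[0]]]
    for t in timestamps[1:]:
        if t - sessions[-1][-1] < gap:
            sessions[-1].append(t)   # still inside the current session
        else:
            sessions.append([t])     # inactivity gap: open a new session
    return sessions

print(sliding_avg([100, 102, 101, 103, 104, 105], 3))    # [101.0, 102.0, 102.66..., 104.0]
print(tumbling_sum([5, 8, 7, 6, 10, 9, 8], 3))           # [20, 25, 8]
print(session_windows([5, 10, 15, 35, 40, 45, 90], 20))  # [[5, 10, 15], [35, 40, 45], [90]]
```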
3. Outlier Detection
Filters anomalous data using statistical methods.
Example:
Stream of hourly sales:
Stream=[200,205,210,800,215,220]
Identify and remove outliers using the Z-score method.
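A Z-score filter for the sales stream might look like this; the threshold of 2 is an assumption (the common cutoff of 3 would not flag 800 here, because the outlier itself inflates the mean and standard deviation):

```python
import statistics

def remove_outliers(stream, z_thresh=2.0):
    """Drop values whose Z-score magnitude exceeds z_thresh (threshold is illustrative)."""
    mean = statistics.mean(stream)
    std = statistics.pstdev(stream)  # population standard deviation
    return [x for x in stream if abs((x - mean) / std) <= z_thresh]

sales = [200, 205, 210, 800, 215, 220]
print(remove_outliers(sales))  # [200, 205, 210, 215, 220]
```

In a true streaming setting the mean and standard deviation would be maintained incrementally rather than recomputed over a batch.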
4. Aggregation-Based Filtering in Data Analytics
Aggregation-based filtering involves summarizing a data stream by applying aggregation operations (e.g., sum, average, maximum, minimum) over a subset of the data (such as a window or group). This helps filter the stream by retaining only the aggregated summaries or by applying thresholds to these aggregated results.
Example 3: Social Media Monitoring
A stream of tweet counts is recorded every hour:
Stream = [100, 150, 200, 180, 250, 300, 400, 350]. Retain only hours where the cumulative tweet count exceeds 500.
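The cumulative-threshold filter can be sketched in Python; cumulative sums run 100, 250, 450, 630, …, so the condition first holds at hour 4:

```python
from itertools import accumulate

def hours_over_threshold(counts, threshold):
    """Return (hour, running_total) pairs once the cumulative count exceeds threshold."""
    running = accumulate(counts)  # cumulative sums: 100, 250, 450, 630, ...
    return [(hour, total) for hour, total in enumerate(running, start=1) if total > threshold]

tweets = [100, 150, 200, 180, 250, 300, 400, 350]
print(hours_over_threshold(tweets, 500))  # [(4, 630), (5, 880), (6, 1180), (7, 1580), (8, 1930)]
```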
