Foundation of Cloud IoT Edge ML

The document contains a series of assignments focused on Edge Computing, IoT, and Machine Learning concepts. It includes multiple-choice questions with answers and explanations covering topics such as edge computing architecture, real-time data processing, and the advantages of local data processing. Key themes include the roles of various components in IoT systems, the importance of low latency, and the use of machine learning models at the edge.


ASSIGNMENT-I

1. Which of the following is a building block of edge computing?

a) Data ingestion and stream processing

b) Centralized data centers

c) High-bandwidth CDN

d) Traditional three-tier architecture

Answer: a) Data ingestion and stream processing

Solution: Edge computing requires efficient data ingestion (e.g., using Kafka) and stream processing for real-time data analysis. These are key building blocks for processing data at the edge, as opposed to sending data to the cloud for processing.
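To make this concrete, here is a minimal sketch of ingest-then-stream-process in plain Python. The generator stands in for a Kafka-style ingestion layer; the sensor names and the alert threshold are invented for illustration.

```python
def ingest(readings):
    """Stand-in for a Kafka-style ingestion layer: yields events as they arrive."""
    for r in readings:
        yield r

def stream_process(events, threshold=75.0):
    """Process each event on arrival, emitting an alert for out-of-range values."""
    alerts = []
    for e in events:
        if e["temp"] > threshold:  # act on data as it arrives, no batching
            alerts.append((e["sensor"], e["temp"]))
    return alerts

readings = [{"sensor": "s1", "temp": 70.2}, {"sensor": "s2", "temp": 80.1}]
alerts = stream_process(ingest(readings))
```

The point of the sketch is that each event is examined the moment it is produced, rather than being collected and shipped to a remote cloud for later batch analysis.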

2. In edge computing, which tier is responsible for running machine learning models?

a) Data Source Tier

b) Storage Tier

c) Actionable Insight Tier

d) Intelligence Tier

Answer: d) Intelligence Tier

Solution: The Intelligence Tier in edge computing is responsible for running machine learning models. While
the cloud may handle model training, the edge handles model inferencing, providing real-time insights based
on data from edge devices.

3. What is the role of M2M brokers in edge computing?

a) Data storage management

b) Enabling machine-to-machine communication

c) Training machine learning models

d) Real-time data conversion

Answer: b) Enabling machine-to-machine communication

Solution: M2M (Machine-to-Machine) brokers orchestrate communication between devices in edge computing environments, enabling devices to exchange data without relying on centralized cloud servers.

4. What is a limitation of the current cloud system for AI use cases?

a) It offers only local processing capabilities

b) It has a low capacity for data storage

c) It cannot provide real-time responses due to latency

d) It lacks programmability of the network stack

Answer: c) It cannot provide real-time responses due to latency

Solution: Cloud computing, due to its centralized nature, suffers from latency issues when AI models need to respond in real time. This limitation makes it unsuitable for mission-critical AI applications that require immediate feedback, which edge computing addresses by processing data locally.

5. Which component is responsible for real-time queries and data processing in edge computing?

a) Stream Processing

b) Function as a Service

c) Object Storage

d) M2M Brokers

Answer: a) Stream Processing

Solution: Stream processing allows real-time data analysis and immediate response to incoming data. In
edge computing, this capability ensures that data is processed and acted upon as it arrives, reducing the
delay in decision-making.

6. How does edge computing mimic public cloud capabilities?

a) By centralizing data storage in remote data centers

b) By providing capabilities like device management and stream analytics near data sources

c) By reducing the need for hardware innovations

d) By utilizing client-server architecture for processing

Answer: b) By providing capabilities like device management and stream analytics near data sources

Solution: Edge computing mimics public cloud capabilities by offering features like device management,
stream analytics, and even running machine learning models close to where the data is generated, thus
enhancing real-time decision-making.

7. What is the primary purpose of the actionable insight layer in edge computing?

a) Storing unstructured data

b) Running machine learning training models

c) Sending alerts and controlling actuators

d) Performing real-time data ingestion

Answer: c) Sending alerts and controlling actuators

Solution: The actionable insight layer is responsible for converting insights from the intelligence layer into
actions, such as sending alerts to stakeholders, updating dashboards, or controlling actuators to respond to
events immediately.

8. What is the primary advantage of edge computing over cloud computing?

a) High latency

b) Centralized processing

c) Data sovereignty

d) Limited scalability

Answer: c) Data sovereignty

Solution: Edge computing keeps sensitive data within local boundaries, ensuring data sovereignty; processing data locally, closer to the source, also reduces latency.

9. Which IoT data flow path processes real-time data immediately upon generation?

a) Cold path

b) Warm path

c) Batch path

d) Hot path

Answer: d) Hot path

Solution: The hot path in IoT systems refers to the processing of real-time data as it is generated. This is
essential for applications that require immediate insights or actions, such as real-time monitoring of
industrial systems.

10. Which of the following is a key feature of Federated Learning?

a) Training occurs on centralized data

b) Data remains decentralized while models are aggregated

c) IoT data is processed only in the cloud

d) Training is skipped in federated models

Answer: b) Data remains decentralized while models are aggregated

Solution: Federated learning ensures that data remains decentralized, with the machine learning models
being trained on local devices. Only model updates are aggregated at a central server, preserving privacy by
keeping sensitive data on the local devices.
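The aggregation step can be sketched in a few lines. This is a simplified federated-averaging (FedAvg) illustration: the weight vectors and client data sizes are made up, and real systems average full model parameter tensors rather than short lists.

```python
def fed_avg(client_weights, client_sizes):
    """Aggregate locally trained model weights, weighted by each client's
    data size. Raw data never leaves the clients; only weights are shared."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)]

# Two clients with equal data sizes: the global model is the plain average.
global_w = fed_avg([[1.0, 2.0], [3.0, 4.0]], [10, 10])
```

Only `global_w` (the aggregated model) would be redistributed to clients; the training data itself stays decentralized.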

ASSIGNMENT-II

1. Which IoT data flow path processes real-time data immediately upon generation?

a. Cold path

b. Hot path

c. Batch path

d. Warm path

Answer: b

Explanation: Hot Path processes real-time data as soon as it is generated, ensuring immediate insights and actions.

2. Because of _______, customers are not willing to send their data to the cloud.

a. Data Integrity Concern

b. Data Privacy Concern

c. High Cost Concern

d. None of these

Answer: b

Explanation: Data privacy concerns prevent customers from sending data to the cloud due to fears of unauthorized access or breaches.

3. What is the role of a "Planner" component in the Edge Controller?

a. Schedule and allocate tasks to edge nodes

b. Manage communication between edge and cloud

c. Monitor IoT device health

d. Perform real-time analytics

Answer: a

Explanation: The Planner component in the Edge Controller is responsible for scheduling and allocating tasks efficiently to edge nodes for execution.

4. Which of the following technologies is commonly used for IoT data storage and batch processing?

a. Azure Event Hub

b. Kafka

c. Data Lake

d. IoT Hub

Answer: c

Explanation: Data Lake is widely used for storing and managing batch-processed IoT data, ensuring that large-scale sensor information is retained for further analytics.

5. How does IoT Central differ from IoT Hub in Azure’s IoT architecture?

a. IoT Central is a SaaS-based IoT application platform, while IoT Hub is a device management and messaging service.

b. IoT Central only supports edge computing, while IoT Hub only supports cloud computing.

c. IoT Central processes only batch data, while IoT Hub handles real-time data.

d. IoT Central is used only for consumer IoT applications, while IoT Hub is used for industrial applications.

Answer: a

Explanation: IoT Central is a managed SaaS platform that simplifies IoT application development, whereas IoT Hub is a more flexible PaaS offering for managing devices and bidirectional messaging.

6. Which of the following is an example of a real-time data processing tool in IoT?

a. Data Factory

b. Azure Synapse

c. Stream Analytics

d. Power BI

Answer: c

Explanation: Stream Analytics is used for real-time data processing (hot path) in IoT platforms. It enables real-time monitoring and decision-making based on incoming data.

7. What is the function of IoT Edge in an IoT architecture?

a. It acts as a middleware between cloud and devices.

b. It is used only for device registration.

c. It only manages IoT device security.

d. It provides local processing and reduces cloud dependency.

Answer: d

Explanation: IoT Edge enables local processing of data before sending it to the cloud, reducing latency and bandwidth usage.

8. Which of the following is NOT a component of IoT data processing architecture?

a. Hot Path

b. Cold Path

c. Warm Path

d. Static Path

Answer: d

Explanation: IoT data processing consists of Hot Path (real-time), Cold Path (batch processing), and Warm Path (small batch processing). There is no "Static Path" in IoT architecture.
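A tiny routing sketch makes the three paths concrete. The event fields `urgent` and `max_delay_s`, and the 60-second warm-path cutoff, are invented for illustration, not part of any IoT SDK.

```python
def route(event):
    """Pick a processing path for an IoT event (illustrative thresholds)."""
    if event.get("urgent"):
        return "hot"     # process immediately, in real time
    if event.get("max_delay_s", 0) <= 60:
        return "warm"    # small-batch, near-real-time processing
    return "cold"        # batch processing for later analytics
```

A fire-alarm event would take the hot path; hourly sensor aggregates can tolerate the cold path's batch latency.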

9. What is the role of Digital Twins in an IoT ecosystem?

a. It creates a virtual model of physical IoT devices.

b. It replaces physical sensors in IoT devices.

c. It stores real-time sensor data permanently.

d. It only provides security for IoT devices.

Answer: a

Explanation: Digital Twins enable virtual representations of physical devices, allowing real-time monitoring, simulation, and optimization of IoT applications.

10. In the IoT architecture, what does the Presentation Layer primarily handle?

a. Device management and provisioning

b. Reporting, visualization, and data APIs


c. Real-time data processing

d. IoT security and encryption

Answer: b

Explanation: The Presentation Layer provides tools for data visualization, reports, and APIs, allowing users to interpret IoT data effectively.

ASSIGNMENT-III

1. In the context of Edge ML, which of the following describes a key benefit of local data processing at the edge?

A. It reduces latency by avoiding the round-trip time to cloud data centers.

B. It requires large amounts of continuous bandwidth for streaming data to the cloud.

C. It relies solely on batch processing in remote cloud servers.

D. It prevents devices from operating when offline.

Answer: A

Explanation: Processing data locally minimizes delay by eliminating the need to send data to remote cloud servers.

2. What is the main function of a Content Delivery Network (CDN) as mentioned in the context of cloud storage?

A. Providing containerized machine learning models

B. Scheduling data processing tasks at the edge

C. Replicating and caching data across multiple edge locations

D. Running inference on large, unstructured datasets

Answer: C

Explanation: A CDN replicates and caches content closer to end users to reduce access latency.

3. Which step in the machine learning workflow involves feeding a model with new, unlabeled data to generate predictions?

A. Data collection

B. Model training

C. Model deployment

D. Inference

Answer: D

Explanation: Inference uses a trained model to predict outcomes from new input data.

4. What is the chief advantage of deploying machine learning models in containers at the edge?

A. Increased manual configuration for network resources

B. Portability and ease of updating the model near data sources


C. Requirement of high-end servers for container orchestration

D. Strict reliance on proprietary APIs for all edge services

Answer: B

Explanation: Containers allow for consistent, portable deployment and rapid updates on edge devices.

5. Azure IoT Hub is characterized by which of the following?

A. A static, one-way communication channel to the cloud

B. Absence of protocol support for IoT devices

C. A managed service offering bi-directional communication between devices and the cloud

D. An offline-only solution that does not integrate with other Azure services

Answer: C

Explanation: Azure IoT Hub enables devices to both send data to and receive commands from the cloud securely.

6. Which of the following object detection models is known for its single-step approach, simultaneously predicting bounding boxes and class labels?

A. Faster R-CNN

B. SSD (Single Shot Detector)

C. Fast R-CNN

D. RCNN

Answer: B

Explanation: SSD performs object localization and classification in one forward pass, making it fast.

7. Why is specialized hardware (e.g., GPUs, NPUs) often necessary at the edge to run machine learning workloads effectively?

A. Deep learning inference typically requires accelerated computation

B. Traditional CPUs cannot connect to IoT devices

C. It reduces the need for bandwidth and storage

D. Virtual machines cannot handle parallel computing

Answer: A

Explanation: GPUs/NPUs accelerate the complex computations required by deep learning models, so deep learning inference typically requires accelerated computation. (The answer key originally listed B, "Traditional CPUs cannot connect to IoT devices," which is incorrect: CPUs can connect to IoT devices; they are simply too slow for many deep learning workloads.)

8. Which of the following is a key characteristic of the YOLOv3 object detection algorithm?

A. It selects regions in an image using a region proposal network.


B. It processes the entire image in one forward pass to predict bounding boxes and probabilities.

C. It focuses only on classification without localizing objects

D. It relies heavily on multi-stage detection pipelines

Answer: B

Explanation: YOLOv3 uses a single network pass to quickly generate both bounding box coordinates and class probabilities.

9. What is the first step in deploying an Edge ML workload to an IoT edge device?

A. Target the IoT edge runtime on the edge device.

B. Write a deployment manifest to define the workload.

C. Push the containers to a container registry.

D. Package the data transform, insight, and action into containers.

Answer: D

Explanation: The initial step is containerizing the workload components so they can be deployed on the edge device.

10. What is the primary advantage of a SaaS architecture for computer vision models?

A. It eliminates the need to label images before training models.

B. It requires domain experts to manage the training process entirely.

C. It restricts the training process to the cloud without offline support.

D. It allows seamless scaling of datasets and downloading models for offline use.

Answer: D

Explanation: A SaaS model offers scalable deployment and easy access to trained models, which can be used offline as needed.

ASSIGNMENT-IV

Q1: Which of the following control plane components is the only one that interacts directly with etcd?

A. Controller Manager

B. API Server

C. Scheduler

D. Kubelet

Answer: B

Explanation: The API Server is the sole component in Kubernetes that directly accesses etcd to read and write cluster state.


Q2: In the context of Kubernetes, orchestration is best described as:

A. Managing and deploying containers across multiple hosts in a fault-tolerant manner

B. Running local compute jobs on a single node

C. Using hypervisors to isolate VMs on a single server

D. Ensuring minimal CPU and memory usage across cloud instances

Answer: A

Explanation: Orchestration in Kubernetes coordinates how containers are deployed, scaled, and managed across multiple nodes.

Q3: Which of the following Kubernetes worker node components is primarily responsible for managing pod networking and handling load balancing?

A. kube-proxy (Service proxy)

B. Scheduler

C. Container runtime

D. Kubelet

Answer: A

Explanation: The kube-proxy handles network configuration, ensuring pods and containers can communicate and balancing traffic among multiple pod replicas.

Q4: In the Docker client-server model, which component performs the actual tasks of building, running, and distributing containers?

A. Docker Compose

B. Docker daemon

C. Docker registry

D. Docker Desktop

Answer: B

Explanation: The Docker daemon (dockerd) listens for Docker API requests and executes container-related tasks such as pulling images, starting containers, and managing networks.

Q5: Which maintenance strategy focuses on preventing failures by performing periodic, scheduled maintenance based on worst-case lifetimes?

A. Reactive maintenance

B. Preventive (planned) maintenance

C. Condition-based maintenance

D. Predictive maintenance

Answer: B

Explanation: Preventive (or planned) maintenance replaces or services parts at regular intervals, rather
than relying on sensor data or sophisticated failure predictions.

Q6: In a predictive maintenance workflow, which step involves removing duplicates, dealing with missing values, and handling outliers before modeling?

A. Define the problem

B. Prepare the data

C. Analyse the data

D. Monitor performance

Answer: B

Explanation: Data preparation (or cleaning) ensures data quality by addressing duplicates, missing values, and outliers prior to analysis or model training.

Q7: Which of the following is an advantage of using LSTM (Long Short-Term Memory) networks for predictive maintenance?

A. LSTMs require less data than traditional models

B. LSTMs cannot handle time-dependent sequences

C. LSTMs remember long-term patterns in sensor data

D. LSTMs only work for image recognition tasks

Answer: C

Explanation: LSTM networks are a type of recurrent neural network designed to capture long-range dependencies, making them well-suited for time-series data in predictive maintenance.
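Before an LSTM can consume sensor readings, the time series is usually sliced into fixed-length input sequences. A minimal windowing sketch (the window length and readings are illustrative, and real pipelines produce arrays rather than lists):

```python
def make_sequences(readings, window=3):
    """Slice a sensor time series into (sequence, next_value) training pairs,
    the input shape an LSTM consumes for one-step-ahead prediction."""
    pairs = []
    for i in range(len(readings) - window):
        pairs.append((readings[i:i + window], readings[i + window]))
    return pairs

pairs = make_sequences([1, 2, 3, 4, 5], window=3)
```

Each pair feeds the network a short history and asks it to predict the next reading, which is how long-term degradation patterns are learned.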

Q8: Which statement best describes Azure Time Series Insights in the context of IoT data?

A. A serverless compute platform for deploying Docker containers

B. A PaaS offering that ingests, stores, and visualizes large volumes of time-series data from IoT devices

C. A virtualization hypervisor for running multiple operating systems

D. A CPU-only compute service for training complex deep learning models

Answer: B

Explanation: Azure TSI provides ingestion, modeling, and visualization of IoT time-series data, supporting analytics and integration with other Azure services.

Q9: In Kubernetes, which component assigns pods to nodes?

A. Scheduler

B. API Server

C. Controller Manager

D. Kubelet

Answer: A

Explanation: The Scheduler assigns pods to nodes by selecting the most appropriate node
based on resource availability and other constraints.

Q10: What is the primary advantage of Recurrent Neural Networks (RNNs), including LSTMs, over Convolutional Neural Networks (CNNs) in time-series applications?

A. RNNs are better at extracting spatial features.

B. RNNs require less data preprocessing compared to CNNs.

C. RNNs focus on feature extraction rather than sequence mapping.

D. RNNs add native support for sequential data and temporal dependencies.

Answer: D

Explanation: RNNs (including LSTMs) inherently capture temporal dependencies in sequential data, which is crucial for time-series applications.

ASSIGNMENT-V

1. What is the primary role of the experience replay pool in the CERAI algorithm?

a. To store completed tasks for analysis after training.

b. To directly update the Actor and Critic networks after each action.

c. To store state transition tuples for sampling during gradient descent.

d. To track resource allocation history across multiple iterations.

Answer: c

Explanation: The experience replay pool stores state transition tuples that are later sampled for gradient descent updates, which helps break correlations in training data.
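A replay pool is little more than a bounded buffer with random sampling. A minimal sketch (the transition tuples are illustrative placeholders, not CERAI's actual state encoding):

```python
import random
from collections import deque

class ReplayPool:
    """Fixed-capacity pool of (state, action, reward, next_state) tuples.
    Random sampling breaks the temporal correlation between consecutive steps."""
    def __init__(self, capacity=1000):
        self.buf = deque(maxlen=capacity)  # oldest entries evicted when full

    def store(self, transition):
        self.buf.append(transition)

    def sample(self, batch_size):
        return random.sample(list(self.buf), batch_size)

pool = ReplayPool(capacity=2)
for t in [("s0", "a0", 1.0, "s1"), ("s1", "a1", 0.5, "s2"), ("s2", "a2", 0.0, "s3")]:
    pool.store(t)  # the first transition is evicted once capacity is reached
```

During training, `sample(batch_size)` supplies decorrelated minibatches for the gradient descent updates of the Actor and Critic networks.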

2. In the DDPG-based resource allocation algorithm, what action does the Actor main network perform?

a. It selects an action based on the state and random noise.

b. It calculates the reward for the Critic network.

c. It directly updates the allocation record H.

d. It computes the next state for the edge node.

Answer: a

Explanation: The Actor main network uses the current state plus added random noise to select an action for exploration.

3. How is the cost of collaborative cloud-edge computing calculated in a public cloud environment?

a. Based solely on the on-demand instance cost.

b. By considering only the computing cost of cloud nodes.

c. By adding the cost of cloud instances (on-demand, reserved, and spot) and the edge node.

d. By averaging the costs of edge and cloud nodes.

Answer: c

Explanation: In a public cloud setup, the total cost is calculated by summing the costs of various cloud instance types (on-demand, reserved, spot) together with the edge node's cost.
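The calculation itself is a straightforward sum. A sketch with hypothetical hourly figures (these are not real cloud prices):

```python
def total_cost(on_demand, reserved, spot, edge):
    """Collaborative cloud-edge cost: the sum of the cloud instance costs
    (on-demand, reserved, spot) and the edge node's cost."""
    return on_demand + reserved + spot + edge

# Illustrative hourly figures, not actual pricing.
cost = total_cost(on_demand=0.40, reserved=0.25, spot=0.10, edge=0.15)
```

In the resource allocation problem, this per-slot cost is what the algorithm accumulates over all time slots and tries to minimize.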

4. What is the role of the Critic network in the Deep Deterministic Policy Gradient (DDPG) algorithm?

a. To directly perform actions based on the policy.

b. To generate resource allocation policies independently.

c. To store experience in the replay pool.

d. To evaluate the Actor’s performance using a value function.

Answer: d

Explanation: The Critic network evaluates the Actor's performance by estimating the value function, which guides the Actor's policy updates.

5. What is the main goal of the resource allocation algorithms in cloud-edge computing?

a. To maximize the number of VMs allocated

b. To minimize the long-term cost of the system

c. To increase the computing time duration

d. To maximize the reward function

Answer: b

Explanation: The goal is to minimize the long-term cost of the system, i.e., the sum of the costs over all T time slots.

6. What are Availability Zones in AWS?

a. Geographic areas where AWS services are available

b. Multiple isolated locations/data centers within a region

c. Edge locations to deliver content to end users

d. Virtual networks defined by customers

Answer: b

Explanation: Availability Zones are defined as multiple isolated locations/data centers within a region.

7. What is the main difference between PAMDP and MDP?

a. PAMDP has a different reward function

b. PAMDP uses a finite set of parameterized actions


c. PAMDP doesn't use neural networks

d. PAMDP is only used for private cloud environments

Answer: b

Explanation: The difference from a standard Markov decision process is that A is a finite set of parameterized actions in a PAMDP.

8. What does the Deep Deterministic Policy Gradient (DDPG) algorithm involve?

a. Only Actor networks.

b. Only Critic networks.

c. Both Actor and Critic networks.

d. Neither Actor nor Critic networks.

Answer: c

Explanation: DDPG involves both Actor and Critic networks to guide the decision-making process in resource allocation.

9. What is the purpose of the experience replay pool in DDPG?

a. To store allocation records.

b. To sample experiences for training the networks.

c. To manage cloud costs.

d. To predict user demand.

Answer: b

Explanation: The experience replay pool is used to sample experiences for updating the Actor and Critic networks during training.

10. What is the Markov Decision Process (MDP) used for in resource allocation?

a. To model sequential decision-making problems.

b. To predict user demand.

c. To manage cloud costs.

d. To optimize edge node performance.

Answer: a

Explanation: MDP is used to model the resource allocation problem as a sequential decision-making process.

ASSIGNMENT-VI

1. What happens in a non-FIFO message queue?

A. Messages are processed in the order they are added

B. Messages are processed randomly

C. Messages are deleted before being processed


D. Messages are queued indefinitely

Answer: B. Messages are processed randomly

Explanation: In a non-FIFO (First-In-First-Out) message queue, messages are not processed in the order they are added; instead, they may be processed randomly or based on priority.

2. Match the following properties with appropriate statements:

Properties:

X: Consistency

Y: Partition-tolerance

Z: Availability

Statements:

1: All nodes see the same data at any time, or reads return the latest written value by any client

2: The system allows operations all the time, and operations return quickly

3: The system continues to work in spite of network partitions

A. X-1, Y-2, Z-3

B. X-3, Y-2, Z-1

C. X-1, Y-3, Z-2

D. X-3, Y-1, Z-2

Answer: C. X-1, Y-3, Z-2

Explanation: Consistency (X) ensures all nodes see the same data (1), Partition-tolerance (Y) ensures the system works despite network partitions (3), and Availability (Z) ensures operations return quickly (2).

3. Which of the following conditions must be satisfied for a global state to be consistent?

A. Messages sent after recording the state must be included in the snapshot.

B. Messages received after recording the state must be excluded from the snapshot.

C. Messages received by a process must have been sent before the snapshot was recorded.

D. Messages sent and received after the snapshot is recorded must be included in the snapshot.

Answer: C. Messages received by a process must have been sent before the snapshot was recorded.

Explanation: A consistent global state ensures that only messages sent before the snapshot are included in the recorded state.

4. What is the primary role of the marker in the Chandy-Lamport algorithm?

A. To separate messages included in the snapshot from those that are not.

B. To identify all incoming messages for a process.

C. To record the state of a process at a specific time.

D. To terminate the distributed snapshot algorithm.

Answer: A. To separate messages included in the snapshot from those that are not.

Explanation: The marker helps distinguish between messages that should be included in the snapshot and those that should not.
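A highly simplified sketch of the marker's role on a single channel: everything that arrived before the marker belongs to the channel's recorded state, and everything after it does not. The message names are illustrative, and the full algorithm also records process states and sends markers on all outgoing channels.

```python
def split_at_marker(channel_log, marker="MARKER"):
    """Split one channel's message log into (in_snapshot, after_snapshot):
    messages before the marker are part of the recorded channel state."""
    idx = channel_log.index(marker)
    return channel_log[:idx], channel_log[idx + 1:]

in_snap, after = split_at_marker(["m1", "m2", "MARKER", "m3"])
```

Here `m1` and `m2` are recorded as in-flight channel state, while `m3`, sent after the marker, is excluded from the snapshot.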

5. What does the total system cost in a cloud-edge computing environment consist of?

A. Only the computation cost of service nodes

B. Only the communication cost of network connections

C. Both the computation cost of service nodes and the communication cost of network connections

D. The sum of computation cost, communication cost, and storage cost

Answer: C. Both the computation cost of service nodes and the communication cost of network connections

Explanation: The total system cost in a cloud-edge computing environment includes both the computation cost (processing tasks on service nodes) and the communication cost (transmitting data over network connections). These two components are critical for optimizing workload distribution and minimizing overall costs. Storage cost is not typically considered a primary component of the total system cost in this context.

6. When tasks are offloaded to a local edge node, the end-to-end service time latency is determined by the sum of which two delays?

A. Data transmission delay and queuing delay

B. Network delay and computational delay

C. Communication delay and storage delay

D. Scheduling delay and execution delay

Answer: B. Network delay and computational delay

Explanation: End-to-end latency is primarily influenced by network delay (communication) and computational delay (processing).

7. In the joint LSTM and deep reinforcement learning model for task offloading, what role does the threshold value in LSTM prediction serve?

A. It sets the maximum processing capacity of the edge server.

B. It decides the communication protocol for task offloading.

C. It balances prediction accuracy and overhead by determining if the predicted task parameters are acceptable.

D. It chooses between local execution and offloading based solely on energy consumption.

Answer: C. It balances prediction accuracy and overhead by determining if the predicted task parameters are acceptable.

Explanation: The threshold value ensures that the predicted task parameters are reliable and acceptable for decision-making.

8. What is the main objective of the DRL algorithm in the context of task offloading based on LSTM prediction?

A. To minimize the number of offloaded tasks

B. To increase energy consumption for faster processing

C. To maximize the total long-term reward by optimizing task scheduling decisions (balancing delay, energy, and task drop rate).

D. To determine the physical location of IoT devices

Answer: C. To maximize the total long-term reward by optimizing task scheduling decisions (balancing delay, energy, and task drop rate).

Explanation: The DRL algorithm aims to optimize task scheduling by balancing delay, energy, and task drop rate for long-term rewards.

9. Which of the following best describes horizontal offloading in a cloud-edge computing environment?

A. Transferring workloads from edge nodes to the cloud

B. Transferring tasks between edge nodes to balance load and reduce latency

C. Offloading tasks from mobile devices to a nearby cloudlet

D. Moving computation from a central data center to a remote cloud

Answer: B. Transferring tasks between edge nodes to balance load and reduce latency

Explanation: Horizontal offloading involves distributing tasks among edge nodes to optimize load and latency.

10. In the Chandy-Lamport algorithm for recording a global snapshot in distributed systems, what is the primary purpose of the marker message?

A. To trigger the execution of local processes

B. To reset the state of the distributed system

C. To separate messages that should be included in the snapshot from those that should not

D. To confirm the termination of the distributed computation

Answer: C. To separate messages that should be included in the snapshot from those that should not

Explanation: The marker message helps distinguish between messages that belong to the snapshot and those that do not.

ASSIGNMENT-VII
1. Which of the following best describes Spark Streaming?

A) A real-time stream processing framework that uses micro-batches

B) A traditional batch processing engine

C) A relational database system

D) A machine learning library

Answer: A.

Spark Streaming breaks the live data stream into small batches for near-real-time processing.
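The micro-batch idea can be illustrated without Spark itself: chop a live stream into small fixed-size batches and process each batch as a unit. The batch size and event values below are illustrative; Spark Streaming batches by a time interval rather than a count.

```python
def micro_batches(stream, batch_size=3):
    """Group an unbounded event stream into small batches, a stand-in for
    a DStream's fixed batch interval."""
    batch = []
    for event in stream:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch      # hand a completed micro-batch to the engine
            batch = []
    if batch:
        yield batch          # flush the final partial batch

# Per-batch aggregation, as a batch engine would run on each micro-batch.
counts = [sum(b) for b in micro_batches(range(7), batch_size=3)]
```

Each micro-batch is then processed with ordinary batch operations, which is how Spark Streaming reuses the batch engine for streaming workloads.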

2. In Spark Streaming, what is a DStream?

A) A static dataset stored on disk

B) A sequence of RDDs representing a continuous data stream

C) A batch file processing tool

D) A database connector

Answer: B.

A DStream is an abstraction that represents a continuous stream as a series of RDDs.

3. What is a major limitation of traditional stream processing systems?

A) Lack of integration with batch processing

B) Inability to process large data streams

C) High processing latencies

D) Limited support for machine learning queries

Answer: A

Traditional systems operated in isolation, forcing separate setups for real-time and batch workloads.

4. In Kafka’s architecture, which component is responsible for storing, replicating, and delivering messages?

A) Producers

B) Brokers

C) Consumers

D) Topics

Answer: B.

Brokers are the Kafka servers that store messages and manage replication across partitions.

5. Which communication pattern does MQTT primarily use?

A) Request/response

B) Publish/subscribe

C) Peer-to-peer

D) Client polling

Answer: B.

MQTT is designed as a lightweight publish/subscribe protocol ideal for unreliable IoT networks.
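In MQTT's publish/subscribe model, the broker matches each published topic against subscription filters, where `+` matches one topic level and `#` matches all remaining levels. A simplified matcher sketch (real brokers also handle edge cases such as `$`-prefixed topics):

```python
def topic_matches(filter_str, topic):
    """Simplified MQTT topic filter matching ('+' = one level, '#' = rest)."""
    f_parts, t_parts = filter_str.split("/"), topic.split("/")
    for i, f in enumerate(f_parts):
        if f == "#":
            return True                 # '#' matches everything from here on
        if i >= len(t_parts):
            return False                # topic has fewer levels than the filter
        if f != "+" and f != t_parts[i]:
            return False                # literal level mismatch
    return len(f_parts) == len(t_parts)

ok = topic_matches("sensors/+/temp", "sensors/kitchen/temp")
```

This decoupling, publishers and subscribers knowing only topic strings rather than each other, is what makes the pattern suit intermittently connected IoT devices.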

6. What is a major advantage of edge data centers compared to centralized data centers?

A) They increase latency due to remote processing

B) They provide lower latency by processing data closer to the source

C) They require more bandwidth for data transmission

D) They centralize all processing in one location

Answer: B.

Edge data centers reduce latency by bringing computing resources closer to end users and devices.

7. According to the CAP theorem, what trade-off must distributed systems make during a

network partition?

A) They can guarantee both consistency and availability

B) They must choose between consistency and availability

C) They can ignore partition tolerance

D) They only need to focus on scalability

Answer: B.

During network partitions, a system can ensure either consistency or availability, but not both

simultaneously.

8. In Cassandra, which consistency level requires a majority of replicas to respond?

A) ONE

B) ANY

C) ALL

D) QUORUM

Answer: D.

QUORUM ensures that more than half of the replicas respond, balancing latency and

consistency.
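The majority requirement behind QUORUM is simple arithmetic over the replication factor (a sketch of the rule, not Cassandra code):

```python
def quorum_size(replication_factor: int) -> int:
    # QUORUM = a strict majority of replicas: floor(RF / 2) + 1.
    return replication_factor // 2 + 1

# With RF = 3, two replicas must acknowledge a read or write;
# with RF = 5, three must.
print(quorum_size(3))  # 2
print(quorum_size(5))  # 3
```

When both reads and writes use QUORUM, the read set and write set always overlap in at least one replica (R + W > RF), which is what gives the consistency guarantee.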

9. What is the primary function of a Bloom filter in Cassandra?

A) Encrypting data on disk

B) Quickly determining if a key might exist in an SSTable

C) Replicating data across multiple nodes

D) Compressing log files

Answer: B.
Bloom filters help efficiently check key existence to avoid unnecessary disk lookups.
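A toy Bloom filter makes the "no false negatives" property concrete (an illustrative sketch, not Cassandra's implementation, which uses different hash functions and sizing):

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: it may report false positives but never
    false negatives, so a 'definitely absent' answer lets Cassandra
    safely skip reading an SSTable from disk."""
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, key: str):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key: str):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))

bf = BloomFilter()
bf.add("row-key-1")
print(bf.might_contain("row-key-1"))  # True: an added key is never missed
print(bf.might_contain("row-key-2"))  # almost certainly False -> skip disk read
```

The asymmetry is the point: a "maybe present" answer costs one extra disk lookup, while a "definitely absent" answer avoids the lookup entirely.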

10. Which statement best characterizes a key-value store in the context of IoT edge storage?

A) It is a distributed, schema-less storage system optimized for fast lookups using keys

B) It is a relational database designed for complex joins

C) It is a file system for storing unstructured data without indexing

D) It is a batch processing engine for big data analytics

Answer: A.

Key-value stores provide scalable, low-latency access by using keys to retrieve values, making

them ideal for IoT applications.
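The key-value access model can be sketched with a minimal store that also shows a common distributed-storage convention, last-write-wins timestamps (an illustration of the idea, not any particular database's API):

```python
import time

class KVStore:
    """Minimal schema-less key-value store with last-write-wins
    timestamps, sketching the fast key-based lookup model used
    by IoT edge stores."""
    def __init__(self):
        self._data = {}  # key -> (timestamp, value)

    def put(self, key, value, ts=None):
        ts = ts if ts is not None else time.time()
        current = self._data.get(key)
        # Keep the newest write; ignore stale ones.
        if current is None or ts >= current[0]:
            self._data[key] = (ts, value)

    def get(self, key, default=None):
        entry = self._data.get(key)
        return entry[1] if entry else default

kv = KVStore()
kv.put("device:42:temp", 21.5, ts=1)
kv.put("device:42:temp", 19.0, ts=0)  # stale write, ignored
print(kv.get("device:42:temp"))  # 21.5
```

Because every operation is a direct lookup by key, there are no joins or schemas to maintain, which is what keeps latency low and scaling simple for high-volume sensor data.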

ASSIGNMENT-VIII

1. Which AWS IoT layer is responsible for registering and managing devices?

a) Things

b) Cloud

c) Intelligence

d) Device Shadow

Answer: b) Cloud

Explanation: AWS IoT Core, a component of the Cloud layer, manages device registration and
communication.

2. AWS Greengrass primarily provides what capability at the edge?

a) Device registration

b) Bulk device onboarding

c) Local processing and inference

d) Device security audit

Answer: c) Local processing and inference

Explanation: Greengrass enables devices to process data and perform machine learning inference locally.

3. Which AWS IoT component serves as a digital identity or twin of a physical device?

a) Rules Engine

b) Device Shadow

c) Device Gateway

d) Device Registry

Answer: b) Device Shadow

Explanation: Device Shadow maintains a synchronized digital representation of a physical device.
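The core of the shadow mechanism is computing the delta between desired and reported state, which tells the device what still needs to change. A rough sketch of that idea (not the AWS SDK; `shadow_delta` is a hypothetical helper):

```python
def shadow_delta(desired: dict, reported: dict) -> dict:
    """Return the fields of the desired state that the device has
    not yet reported, mirroring the 'delta' a device shadow
    publishes so a device can reconcile itself."""
    return {k: v for k, v in desired.items() if reported.get(k) != v}

# The app wants the LED on; the device last reported it off.
desired = {"led": "on", "interval": 30}
reported = {"led": "off", "interval": 30}
print(shadow_delta(desired, reported))  # {'led': 'on'}
```

Because the shadow persists in the cloud, this reconciliation works even when the device is intermittently connected: the device reads the delta on its next connection.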

4. Which communication protocols does AWS IoT Core predominantly use?

a) MQTT and HTTP

b) FTP and SMTP


c) SSH and TCP

d) UDP and RTP

Answer: a) MQTT and HTTP

Explanation: AWS IoT Core supports MQTT (including MQTT over WebSockets) and HTTPS for message communication.

5. In AWS IoT architecture, what role does AWS IoT Device Defender primarily fulfil?

a) Device management

b) Data analytics

c) Security monitoring and alerts

d) Device shadow synchronization

Answer: c) Security monitoring and alerts

Explanation: Device Defender identifies and alerts on configuration and security anomalies.

6. In the federated learning approach, what primarily differentiates it from traditional machine

learning?

a) Centralized data processing

b) Decentralized model training

c) Use of supervised learning only

d) High-speed internet dependency

Answer: b) Decentralized model training

Explanation: Federated learning involves training ML models across multiple decentralized nodes without
centralized data pooling.

7. What key challenge is associated with federated learning due to non-IID data distribution?

a) Higher accuracy

b) Easier computation

c) Client drift

d) Reduced data privacy

Answer: c) Client drift

Explanation: Client drift occurs because locally trained models may diverge significantly from one another when clients hold non-IID data.
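The server-side aggregation step of federated learning (federated averaging) can be sketched in plain Python; this is a conceptual illustration with model weights as flat lists, not a full training loop:

```python
def fed_avg(client_weights, client_sizes):
    """Federated averaging: each client trains locally, and the
    server combines the resulting models weighted by local dataset
    size. With non-IID data the local models diverge (client drift),
    which plain averaging only partially corrects."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Two clients whose local models have drifted apart, with unequal data sizes.
global_model = fed_avg([[1.0, 2.0], [3.0, 6.0]], [100, 300])
print(global_model)  # [2.5, 5.0]
```

Note that raw data never leaves the clients; only the model parameters are sent to the server, which is the privacy advantage over centralized training.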

8. Autonomous vehicles require edge computing primarily because:

a) It is cheaper than cloud computing.

b) It provides faster real-time processing.

c) It has unlimited storage.

d) It is easier to deploy globally.

Answer: b) It provides faster real-time processing.

Explanation: Edge computing provides real-time processing necessary for quick decision-making in
autonomous vehicles.

9. What is the primary role of the Lambda functions in AWS Greengrass?

a) Managing network security

b) Facilitating device-to-device communication

c) Running local compute tasks triggered by events

d) Registering devices in the cloud

Answer: c) Running local compute tasks triggered by events

Explanation: Lambda functions enable local execution of tasks based on defined triggers and events.

10. Which sensor data is commonly used in autonomous vehicles to provide detailed 3D

representations of surroundings?

a) Camera

b) Radar

c) Lidar

d) Ultrasonic

Answer: c) Lidar

Explanation: Lidar sensors emit laser pulses and measure their time of flight to build detailed 3D point clouds of the vehicle's surroundings.
