The basic programming abstraction of Spark Streaming is _.
DStreams--rgt
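For illustration, a minimal sketch of a Spark Streaming app built around this abstraction; the local master, app name, and the socket source on localhost:9999 are placeholder choices, not part of the question.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object DStreamBasics {
  def main(args: Array[String]): Unit = {
    // Two local threads: one for the socket receiver, one for processing.
    val conf = new SparkConf().setMaster("local[2]").setAppName("DStreamBasics")
    // The batch interval (10 seconds here) is fixed when the StreamingContext is created.
    val ssc = new StreamingContext(conf, Seconds(10))

    // `lines` is a DStream: the basic programming abstraction of Spark Streaming.
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.print()

    ssc.start()            // the computation actually begins here
    ssc.awaitTermination() // block until the streaming job is stopped
  }
}
```
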
Which among the following can act as a data source for Spark Streaming?
All the options--rgt
DStreams are internally a collection of _.
RDDs--rgt
HDFS cannot be a sink for Spark Streaming.
False--rgt
We cannot configure Twitter as a data source system for Spark Streaming.
False--rgt
Spark Streaming can be used for real-time processing of data.
True--rgt
DStreams cannot be created directly from sources such as Kafka and Flume.
False--rgt
Internally, a DStream is represented as a sequence of _ arriving at discrete time intervals.
RDDs--rgt
Spark Streaming converts the input data streams into ______.
micro-batches--rgt
DStreams can be created from an existing DStream.
True--rgt
How can a DStream be created?
Both ways--rgt
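A spark-shell style sketch of both ways, assuming a socket source on localhost:9999 as the input (placeholder values):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setMaster("local[2]").setAppName("CreateDStreams")
val ssc  = new StreamingContext(conf, Seconds(5))

// Way 1: an input DStream created directly from a streaming source (a TCP socket here).
val lines = ssc.socketTextStream("localhost", 9999)

// Way 2: a new DStream derived by transforming an existing DStream.
val words = lines.flatMap(_.split(" "))
```
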
Block Management units in the worker nodes report to ___.
Block Management Master in the Driver--rgt
Choose the correct statement.
All the options--rgt
Block Management Master keeps track of _
Block id--rgt
ssc.start() is the entry point for a Streaming application.
True--rgt
The receiver divides the stream into blocks and keeps them in memory.
True--rgt
Starting point of a streaming application is _.
ssc.start()--rgt
When is a batch interval defined?
creation of Streaming context--rgt
Sliding Interval is the interval at which sliding of the window area occurs.
True--rgt
Which among the following needs to be a multiple of batch interval?
All the options--rgt
Which among the following is true about Window Operations?
All the options--rgt
There can be multiple DStreams in a single window.
True--rgt
What is a Window Duration/Size?
Interval at which a certain fold operation is done on top of DStreams.--rgt
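A sketch of a windowed count over the usual word-count pipeline; the 10 s batch interval, 30 s window duration, and 20 s sliding interval are illustrative values chosen so that both window parameters are multiples of the batch interval:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setMaster("local[2]").setAppName("WindowedCounts")
val ssc  = new StreamingContext(conf, Seconds(10))   // batch interval: 10 s

val pairs = ssc.socketTextStream("localhost", 9999)
  .flatMap(_.split(" "))
  .map(word => (word, 1))

// Window duration (30 s) and sliding interval (20 s) are both multiples of
// the batch interval; the fold (here a sum) runs over each window of RDDs.
val windowedCounts = pairs.reduceByKeyAndWindow(
  (a: Int, b: Int) => a + b, Seconds(30), Seconds(20))

windowedCounts.print()
ssc.start()
ssc.awaitTermination()
```
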
Which among the following is true about Spark Streaming?
All the options
reduceByKey is a _.
Transformation
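A short word-count sketch (the socket source and intervals are placeholders) showing reduceByKey used as a DStream transformation:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setMaster("local[2]").setAppName("ReduceByKeyExample")
val ssc  = new StreamingContext(conf, Seconds(5))

val words = ssc.socketTextStream("localhost", 9999).flatMap(_.split(" "))

// reduceByKey is a transformation: it returns a new DStream and is evaluated
// lazily, once per micro-batch, when an output operation such as print() runs.
val counts = words.map(word => (word, 1)).reduceByKey(_ + _)

counts.print()   // output operation that triggers the computation
ssc.start()
```
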
With Spark Streaming, the incoming data is split into micro batches.
True--correct
What strategy is used to prevent loss of the incoming stream?
Data is replicated across different nodes
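One way this surfaces in the API: receiver-based input DStreams accept a storage level, and the replicated "_2" levels keep a copy of every received block on a second executor. A sketch with placeholder host and port:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setMaster("local[2]").setAppName("ReplicatedReceiver")
val ssc  = new StreamingContext(conf, Seconds(5))

// The "_2" storage levels replicate each received block to another executor,
// so a single worker failure does not lose the incoming stream.
val lines = ssc.socketTextStream("localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER_2)
```
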
What does saveAsTextFiles(prefix, [suffix]) do?
Save this DStream's contents as text files--correct
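A sketch of how this looks in practice over the same word-count pipeline; the HDFS path and the "txt" suffix are placeholders:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setMaster("local[2]").setAppName("SaveAsTextFiles")
val ssc  = new StreamingContext(conf, Seconds(10))

val counts = ssc.socketTextStream("localhost", 9999)
  .flatMap(_.split(" "))
  .map(word => (word, 1))
  .reduceByKey(_ + _)

// One output directory of text files per batch,
// named "<prefix>-<batch time in ms>.<suffix>".
counts.saveAsTextFiles("hdfs://namenode:8020/streams/wordcounts", "txt")

ssc.start()
```
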
MLlib and Spark SQL can work on top of the data taken up via Spark Streaming.
True--correct
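A sketch of the usual bridge: foreachRDD hands each micro-batch to Spark SQL, and an MLlib model can consume the same RDDs or DataFrames in the same place; the master, source, and table name are illustrative:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingWithSparkSQL")
val ssc  = new StreamingContext(conf, Seconds(10))

val words = ssc.socketTextStream("localhost", 9999).flatMap(_.split(" "))

// Each micro-batch arrives as an RDD, which Spark SQL (or an MLlib model)
// can consume inside foreachRDD.
words.foreachRDD { rdd =>
  val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate()
  import spark.implicits._

  val df = rdd.toDF("word")
  df.createOrReplaceTempView("words")
  spark.sql("SELECT word, COUNT(*) AS total FROM words GROUP BY word").show()
}

ssc.start()
```
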
What is a Batch Interval?
Interval at which a certain operation is done on top of DStreams.
Who is responsible for keeping track of the Block Ids?
Block Management Master in the Driver--correct
Which among the following are Basic Sources of Spark Streaming?
Kafka--correct
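For a Kafka source specifically, a sketch using the spark-streaming-kafka-0-10 connector (which must be on the classpath); the broker address, group id, and topic name are placeholders:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

val conf = new SparkConf().setMaster("local[2]").setAppName("KafkaSource")
val ssc  = new StreamingContext(conf, Seconds(10))

val kafkaParams = Map[String, Object](
  "bootstrap.servers"  -> "localhost:9092",
  "key.deserializer"   -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id"           -> "spark-streaming-quiz",
  "auto.offset.reset"  -> "latest")

// Each Kafka record becomes an element of the input DStream.
val stream = KafkaUtils.createDirectStream[String, String](
  ssc, PreferConsistent, Subscribe[String, String](Array("events"), kafkaParams))

stream.map(record => record.value).print()
ssc.start()
```
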
Which among the following can act as a data sink for Spark Streaming?
All the options
Which of the following transformations can be applied to a DStream?
All the options--correct
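A sketch listing a few of those transformations side by side (the socket source and the length threshold are arbitrary):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setMaster("local[2]").setAppName("DStreamTransformations")
val ssc  = new StreamingContext(conf, Seconds(5))

val lines = ssc.socketTextStream("localhost", 9999)

// A few of the transformations available on a DStream; each one returns a new DStream.
val words     = lines.flatMap(_.split(" "))   // flatMap
val longWords = words.filter(_.length > 4)    // filter
val upper     = longWords.map(_.toUpperCase)  // map
val perBatch  = words.count()                 // count of elements in each micro-batch
val combined  = words.union(upper)            // union of two DStreams

perBatch.print()
combined.print()
ssc.start()
```
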
Benefits of Discretized Stream Processing are ___.
All the options
DStreams are _.
Collection of RDDs
What is a Sliding Interval?
Interval at which sliding of the window area occurs.
DStreams are internally _.
Collection of RDDs
DStream represents a continuous stream of data.
True--correct
Receiver receives data from the streaming sources at the start of _.
Streaming Context
Batch interval is configured at _.
creation of Streaming Context