Databricks MCQs: Questions and Answers

This document contains 15 multiple-choice questions about Azure Databricks concepts and operations. It covers topics such as Databricks concepts, connecting data sources, capturing streaming data, authentication and authorization, and Spark SQL operations on DataFrames such as filtering, sorting, joining, and sampling. The answers explain the reasoning behind each selection.

Uploaded by Sonali Manjunath

1. Which one of the following is NOT an operation that can be performed using Azure Databricks?
A. It is an Apache Spark-based analytics platform
B. It helps to extract, transform, and load data
C. Visualization of data is not possible with it
D. All of the above
View Answer
Ans : C

Explanation: Azure Databricks also helps in visualization of data.

2. To which one of the following sources does Azure Databricks connect for collecting streaming data?
A. Kafka
B. Azure Data Lake
C. Cosmos DB
D. None of the above
View Answer
Ans : A

Explanation: Azure Databricks can be connected to sources such as Kafka and Event Hubs for the purpose of collecting streaming data.

3. Which one of the following is a Databricks concept?
A. Workspace
B. Authentication and authorization
C. Data Management
D. All of the above
View Answer
Ans : D

Explanation: There are five main categories of Databricks concepts: Workspace, Data Management, Computation Management, Model Management, and Authentication and Authorization.

4. Which of the following ensures data reliability even after termination of a cluster in Azure Databricks?
A. Databricks Runtime
B. Databricks File System
C. Dashboards
D. Workspace
View Answer
Ans : B

Explanation: Databricks File System (DBFS) is a distributed file system mounted on Databricks clusters; data stored in it persists even after a cluster is terminated.

5. Choose the correct option with respect to ETL operations on data in Azure Databricks:
A. For loading of data, data is moved from Databricks to a data warehouse
B. For loading of data, blob storage is used
C. Blob storage serves as temporary storage
D. All of the above
View Answer
Ans : D

Explanation: All of the above statements are true. Loading essentially moves the data into a SQL data warehouse.

6. Which one of the following is incorrect regarding the Workspace concept of Azure Databricks?
A. It manages ETL operations of data
B. It can store notebooks, libraries, and dashboards
C. It is the root folder of Azure Databricks
D. None of the above
View Answer
Ans : A

Explanation: ETL (Extract, Transform, and Load) operations come under computation management.

7. Which of the following Azure data sources can be connected to Azure Databricks?
A. Azure Blob Storage
B. Azure SQL Data Warehouse
C. Azure Cosmos DB
D. All of the above
View Answer
Ans : D

Explanation: Azure Databricks can connect to many data sources, including all three listed above.

8. Streaming data can be captured by:
A. Kafka
B. Event Hubs
C. Both A and B
D. None of the above
View Answer
Ans : C

Explanation: Both Kafka and Event Hubs are capable of capturing streaming data.

9. Authentication and authorization in Databricks can be managed for:
A. User, Group, Access Control List
B. User, Group
C. Access Control List
D. Group, Access Control List
View Answer
Ans : A

Explanation: In Azure Databricks, authentication and authorization can be managed for users, groups, and access control lists.

10. Which one of the following is a set of components that run on clusters of Azure Databricks?
A. Databricks File System
B. Databricks Runtime
C. Cosmos DB
D. Azure Data Lake
View Answer
Ans : B

Explanation: Databricks Runtime is the set of core components, built on Apache Spark, that runs on Databricks clusters.
11. Given a DataFrame df, select the code that returns its number of rows:
A. df.take('all')
B. df.collect()
C. df.show()
D. df.count() --> CORRECT
E. df.numRows()

12. Given a DataFrame df that includes a number of columns, among which a column named quantity and a column named price, complete the code below such that it will create a DataFrame including all the original columns and a new column revenue defined as quantity*price:

df._1_(_2_, _3_)

A. withColumnRenamed, "revenue", expr("quantity*price")
B. withColumn, revenue, expr("quantity*price")
C. withColumn, "revenue", expr("quantity*price") --> CORRECT
D. withColumn, expr("quantity*price"), "revenue"
E. withColumnRenamed, "revenue", col("quantity")*col("price")

13. Given a DataFrame df that has some null values in the column created_date, complete the code below such that it will sort rows in ascending order based on the column created_date, with null values appearing last:

df._1_(_2_)

A. orderBy, asc_nulls_last("created_date")
B. sort, asc_nulls_last("created_date")
C. orderBy, col("created_date").asc_nulls_last() --> CORRECT
D. orderBy, col("created_date"), ascending=True
E. orderBy, col("created_date").asc()

14. Which one of the following commands does NOT trigger an eager evaluation?

A. df.collect()
B. df.take()
C. df.show()
D. df.saveAsTable()
E. df.join() --> CORRECT

15. The code below should return a new DataFrame with 50 percent of random records from DataFrame df, without replacement. Choose the response that correctly fills in the numbered blanks within the code block to complete this task:

df._1_(_2_, _3_, _4_)

A. sample, False, 0.5, 5 --> CORRECT
B. random, False, 0.5, 5
C. sample, False, 5, 25
D. sample, False, 50, 5
E. sample, withoutReplacement, 0.5, 5
