CDE Unit-wise Assignments

UNIT - 1
1. Describe the evolution of data engineering from its inception to the present day.
2. Explain how new technology paradigms have impacted data engineering
practices.
3. Compare and contrast ETL and ELT processes, highlighting their advantages
and disadvantages.
4. Evaluate how centralizing data in the cloud can impact an organization’s data
strategy.

UNIT - 2
1. What is Big Data, and why is planning critical when working with Big Data? Discuss the
challenges faced during ETL (Extract, Transform, Load) processes when dealing with
Big Data.
2. Explain the concepts of Scaling Up and Scaling Out in the context of Big Data platforms.
Provide examples of when each approach is appropriate.
3. Describe the key components of a Big Data platform blueprint and their roles in handling
Big Data.
4. Compare and contrast Batch Processing and Stream Processing within the context of
Lambda Architecture.
5. What is the Kappa Architecture, and how does it serve as an alternative to Lambda
Architecture? Discuss the scenarios where Kappa Architecture might be more suitable.

UNIT - 3

1. Describe the concept of Privacy by Design and provide one example of how it can be
applied in a cloud data engineering context.
2. What is the purpose of JSON Web Tokens (JWT), and how are they used in securing
API requests in cloud services?
3. Explain the difference between cloud and on-premises environments in terms of data
management and security.
4. How does GDPR impact data processing practices in cloud environments? Provide one
specific requirement of GDPR that affects cloud data engineers.
5. List and briefly describe three key features of a Hybrid Cloud environment.

UNIT - 4

1. How does Apache Impala differ from Hive in terms of query performance and use
cases?
2. Write a SQL query to update the salary of an employee with employee_id 12345
to $60,000 in the employees table.
3. Describe the advantages of using Spark DataFrames over RDDs for data
processing tasks.
4. What is a key-value store in NoSQL databases, and how is it different from a
document store?
5. Explain the concept of MapReduce and its role in processing large datasets.
Provide a simple example of a MapReduce job.
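
For reference on question 5, the kind of job being asked for can be sketched in plain Python as a word count. This is only an illustrative, single-process sketch: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and do not belong to Hadoop or any other framework, and a real job would distribute these phases across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as a framework would
    do between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big platforms"]))
```

Because each line is mapped independently and each key is reduced independently, both phases could in principle run in parallel on different machines, which is the property that makes the model suitable for large datasets.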
UNIT - 5

1. List and briefly describe any two popular Business Intelligence
(BI) tools. How are these tools beneficial for data visualization?
2. Explain the importance of APIs in mobile applications for data
visualization. How does an API facilitate interaction between
mobile apps and back-end data services?
3. Imagine you need to display data insights on a web platform using
Node-RED or Tomcat. Describe the steps you would take to
configure the server to display interactive dashboards.
4. What is a Digital Twin, and how does it support identity and device
management in modern systems? Provide one example where a
Digital Twin could enhance operational efficiency.
5. Describe the primary purpose of Active Directory in managing
identities within an organization. How does it help in controlling
access to resources?
