Data Engineering Masters Program New Curriculum V5

The document outlines a 5-month course focused on Microsoft Azure and PySpark, with a validity of 2 years. It includes a detailed curriculum divided into two milestones covering distributed processing fundamentals, Apache Spark, and Azure cloud technologies, along with practical projects and career support modules. Key topics include Azure Databricks, data lake architecture, and various tools and techniques for big data processing.

Uploaded by reheki6971
Copyright: © All Rights Reserved

Course Duration - 5 months | Validity - 2 years

Key Tools and Technologies Covered in this Program

Microsoft Azure
Azure Data Factory
PySpark & Azure End-to-End Projects

CURRICULUM
Milestone 1 - Distributed Processing Fundamentals & PySpark
Week 1 : Big Data - The Big Picture
Week 2 : Distributed Storage & Data Lake
Week 3 : Distributed Processing Fundamentals
Week 4 : Apache Spark Core APIs
Week 5 : Spark APIs - DataFrames & Spark SQL
Week 6 : Spark DataFrame Transformations
Week 7 : Apache Spark Caching In-depth
Week 8 : Apache Spark Architecture
Week 9 : Apache Spark Internals
Week 10 : Apache Spark Optimizations
Week 11 : More on Spark Optimizations
Week 12 : Git, GitHub & CI/CD
Week 13 : Apache Hive

Milestone-1 Power Modules to Kickstart your Career


Apache Spark Project
Apache Spark Interview Questions
Resume, LinkedIn & Naukri Profile Building
Data Structures & Algorithms
Milestone 2 - Azure Cloud
Week 14 : Azure Cloud Fundamentals
Week 15 - 22 : Azure Databricks In-Depth (8 Weeks)
Week 23 - 24 : Azure Data Factory (2 Weeks)

Milestone-2 Power Modules to Kickstart your Career


Azure Cloud Capstone Project
Azure Interview Questions
Overview of the Azure Databricks In-Depth Module (8 Weeks)
→ What is Databricks and Why Databricks
→ Databricks Free Edition vs Azure Databricks
→ High Level Architecture of Databricks
→ Different Cluster Creation Modes
→ Different Types of Tables in Databricks
→ Iceberg Managed Tables in Databricks
→ Magic Commands in Databricks
→ Databricks Utilities
→ Lakehouse Architecture
→ Delta Lake in Depth
→ Volumes
→ Databricks COPY INTO
→ Auto Loader
→ Lakeflow Declarative Pipelines (formerly Delta Live Tables, DLT)
→ Implementing a Medallion Architecture
→ Governance using Unity Catalog
→ Lakeflow Connect
→ Lakeflow Jobs
→ Deployment – Databricks Asset Bundles
THANK YOU
