Myntra Data Engineer Interview Guide – Experienced (3+ Years)
Round 1 - Soft Skills & SQL Basics (Recruiter Assessment)
Overview:
The first round focused on foundational technical skills in SQL, Python, and data modeling,
coupled with an assessment of communication skills.
Detailed Breakdown:
1. Soft Skills Assessment:
The recruiter began by asking for a quick self-introduction:
Focused on presenting your role, tech stack, and recent projects.
Highlighted responsibilities like building pipelines, optimizing data
systems, and integrating tools like Spark and Databricks.
2. SQL Basics:
Questions focused on intermediate SQL concepts; a combined runnable sketch follows this list. Examples include:
HAVING vs WHERE:
Explained how WHERE filters rows before aggregation,
whereas HAVING filters groups post-aggregation.
SELF JOIN Applications:
Discussed scenarios like finding manager-employee
relationships within the same table.
WINDOW Functions:
Provided use cases for RANK() and DENSE_RANK() for ordering
data within partitions.
Indexing:
A True/False question testing the understanding of indexes
and their role in query optimization.
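The following is a minimal runnable sketch of the first three concepts, using PySpark (already part of the tech stack discussed in this guide). The orders and employees tables, their columns, and all sample values are illustrative assumptions, not the interview's actual data; indexing is omitted because Spark SQL has no index DDL to demonstrate.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sql-basics-sketch").getOrCreate()

    # Hypothetical purchases table.
    spark.createDataFrame(
        [(1, "alice", 120.0), (2, "alice", 80.0), (3, "bob", 60.0), (4, "bob", 40.0)],
        "order_id INT, user_name STRING, amount DOUBLE",
    ).createOrReplaceTempView("orders")

    # WHERE filters rows before GROUP BY; HAVING filters the aggregated groups.
    spark.sql("""
        SELECT user_name, SUM(amount) AS total
        FROM orders
        WHERE amount > 50            -- row-level filter, applied first
        GROUP BY user_name
        HAVING SUM(amount) > 100     -- group-level filter, applied after aggregation
    """).show()

    # Hypothetical employees table for the SELF JOIN scenario.
    spark.createDataFrame(
        [(1, "carol", None), (2, "dave", 1), (3, "erin", 1)],
        "emp_id INT, name STRING, manager_id INT",
    ).createOrReplaceTempView("employees")

    # SELF JOIN: the table joined to itself to pair employees with managers.
    spark.sql("""
        SELECT e.name AS employee, m.name AS manager
        FROM employees e JOIN employees m ON e.manager_id = m.emp_id
    """).show()

    # RANK() leaves gaps after ties within a partition; DENSE_RANK() does not.
    spark.sql("""
        SELECT user_name, amount,
               RANK()       OVER (PARTITION BY user_name ORDER BY amount DESC) AS rnk,
               DENSE_RANK() OVER (PARTITION BY user_name ORDER BY amount DESC) AS drnk
        FROM orders
    """).show()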
3. Data Modeling (2 out of 5 Questions):
Fact vs Dimension Table:
Given a real-world problem, explained how to categorize data into fact
tables (measurable data) and dimension tables (descriptive data).
Slowly Changing Dimension (SCD) Type 4:
Described the hybrid approach of maintaining a current data table and
a historical changes table.
Tracking Historical Data Changes:
Wrote a SQL query leveraging window functions and timestamps to
identify updates over time; one plausible shape of such a query is sketched below.
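A minimal PySpark sketch of both ideas, under assumed names: customer_history plays the SCD Type 4 history table (a companion current table would hold only the latest row per customer_id), and LAG over the update timestamp flags the rows where an attribute actually changed.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("scd4-sketch").getOrCreate()

    # Illustrative SCD Type 4 history table.
    spark.createDataFrame(
        [(101, "Hyderabad", "2024-01-01"),
         (101, "Bangalore", "2024-03-15"),
         (102, "Mumbai",    "2024-02-10")],
        "customer_id INT, city STRING, updated_at STRING",
    ).createOrReplaceTempView("customer_history")

    # LAG exposes the previous value per key; comparing it to the current
    # value identifies the updates over time (NULL marks the first version).
    spark.sql("""
        SELECT customer_id, prev_city, city, updated_at
        FROM (
            SELECT customer_id, city, updated_at,
                   LAG(city) OVER (PARTITION BY customer_id
                                   ORDER BY updated_at) AS prev_city
            FROM customer_history
        ) AS v
        WHERE prev_city IS NULL OR city <> prev_city
    """).show()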
4. Python Problem:
Problem Statement: Write a Python program to calculate total spending,
identify top 5 users by spending, and find the most purchased product.
Solution Approach:
Used dictionaries and sorting functions to efficiently calculate total
spend per user.
Applied Counter from Python's collections module to identify the most
frequently purchased product.
Followed up with explanations around list comprehensions, lambda functions,
and performance optimizations. A worked version of the solution is sketched below.
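The sketch assumes the input arrives as a list of (user, product, amount) records; the data shape and sample values are illustrative, since the original dataset was not given.

    from collections import Counter

    # Hypothetical purchase records: (user, product, amount).
    purchases = [
        ("alice", "shoes",   120.0),
        ("bob",   "t-shirt",  25.0),
        ("alice", "t-shirt",  30.0),
        ("carol", "jeans",    60.0),
    ]

    # Total spending per user, accumulated in a dictionary.
    spend = {}
    for user, product, amount in purchases:
        spend[user] = spend.get(user, 0.0) + amount

    total_spending = sum(spend.values())

    # Top 5 users: sort the dict items by amount, descending, and slice.
    top5 = sorted(spend.items(), key=lambda kv: kv[1], reverse=True)[:5]

    # Most purchased product: Counter tallies occurrences in one pass.
    most_purchased, _ = Counter(p for _, p, _ in purchases).most_common(1)[0]

    print(total_spending, top5, most_purchased)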
Round 2 - Technical Discussion
Overview:
This round focused on practical experiences, challenges faced in real-world projects, and
SQL troubleshooting.
Key Highlights:
1. Project-Based Discussion:
Discussed your previous projects in detail:
Explained the architecture of a data pipeline (e.g., ingestion →
processing → storage → reporting).
Highlighted your role in designing and optimizing Spark jobs for ETL
processes.
Challenges:
Talked about issues like data skewness and how you resolved them
using techniques like salting and repartitioning.
Optimizing Spark jobs with caching, tuning executor memory, and
using broadcast joins; salting and broadcasting are sketched below.
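A minimal sketch of those two techniques, assuming a large, skewed facts DataFrame joined to a small dims DataFrame; the names, the salt bucket count, and the sample rows are all illustrative.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("skew-sketch").getOrCreate()

    facts = spark.createDataFrame(
        [("u1", 10), ("u1", 20), ("u1", 30), ("u2", 5)], "user_id STRING, amount INT"
    )
    dims = spark.createDataFrame(
        [("u1", "IN"), ("u2", "US")], "user_id STRING, country STRING"
    )

    # Salting: append a random bucket to the hot key so one user's rows spread
    # over N partitions; replicate the small side across all N salt values so
    # every salted row still finds its match.
    N = 8
    salted_facts = facts.withColumn(
        "salted_key",
        F.concat_ws("_", "user_id", (F.rand() * N).cast("int").cast("string")),
    )
    salted_dims = (
        dims.withColumn("salt", F.explode(F.array([F.lit(str(i)) for i in range(N)])))
            .withColumn("salted_key", F.concat_ws("_", "user_id", "salt"))
    )
    joined = salted_facts.join(salted_dims.select("salted_key", "country"), "salted_key")

    # Broadcast join: ship the small table to every executor and skip the
    # shuffle entirely; often the simpler fix when one side is small.
    broadcast_joined = facts.join(F.broadcast(dims), "user_id")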
2. Unsolved Written Test Questions:
The interviewer revisited SQL and Python questions from Round 1:
Provided alternative approaches for solving JOIN-based SQL queries
involving null values.
Demonstrated Python solutions emphasizing code readability and
efficiency.
3. SQL Problems:
Solved medium-level SQL questions involving two tables with null values.
The required outputs for LEFT JOIN, RIGHT JOIN, and INNER JOIN were
discussed with a clear query structure; a minimal reproduction is sketched below.
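The two single-column tables below are assumptions chosen to surface the NULL behavior: equality predicates never match NULL keys, so NULL rows appear only as unmatched rows in the outer joins.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("null-joins-sketch").getOrCreate()

    spark.createDataFrame([(1,), (2,), (None,)], "id INT").createOrReplaceTempView("a")
    spark.createDataFrame([(1,), (None,), (None,)], "id INT").createOrReplaceTempView("b")

    # INNER keeps only matched pairs; LEFT keeps all of a, RIGHT all of b.
    # NULL = NULL evaluates to NULL (not true), so NULL ids never match.
    for join_type in ("INNER", "LEFT", "RIGHT"):
        print(join_type)
        spark.sql(
            f"SELECT a.id AS a_id, b.id AS b_id FROM a {join_type} JOIN b ON a.id = b.id"
        ).show()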
Round 3 - System Design
Overview:
This round tested your understanding of Apache Spark, file formats, and advanced SQL
design scenarios.
Detailed Topics Discussed:
1. Apache Spark Fundamentals:
Core Concepts:
Defined executors, cores, stages, jobs, transformations, and actions in
Spark.
Highlighted how Spark organizes execution as a DAG (Directed
Acyclic Graph) of stages and tasks.
Optimization:
Discussed REPARTITION vs COALESCE for managing partitions.
Explained data skew resolution using partitioning strategies; both points are illustrated in the sketch below.
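A small sketch contrasting the two calls; the partition counts are arbitrary. It also shows the lazy-DAG point from above: both calls are transformations that only extend the plan, and nothing executes until an action runs.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("partition-sketch").getOrCreate()

    df = spark.range(1_000_000)        # transformation: nothing executes yet

    # repartition(n, cols) does a full shuffle; it can grow or shrink the
    # partition count and rebalances skew by hashing the given columns.
    rebalanced = df.repartition(200, "id")

    # coalesce(n) merges existing partitions without a shuffle; it can only
    # reduce the count and does not rebalance skewed partitions.
    narrowed = df.coalesce(4)

    print(rebalanced.rdd.getNumPartitions())   # 200
    print(narrowed.rdd.getNumPartitions())     # 4 (or fewer if source had fewer)
    print(narrowed.count())                    # an action: triggers the job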
2. File Format Comparisons:
Delta vs Parquet:
Explained Delta Lake’s ACID transactions, schema enforcement, and
time-travel features.
Compared it to Parquet, highlighting performance trade-offs.
Z-Ordering:
Provided use cases where Z-Ordering improves query performance
for partitioned Delta tables; see the sketch below.
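A hedged sketch of those features, assuming a Spark session configured with the open-source delta-spark package (OPTIMIZE ... ZORDER BY needs Delta Lake 2.0+ or Databricks); the /tmp paths are illustrative.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("delta-sketch").getOrCreate()

    df = spark.range(1000).withColumnRenamed("id", "order_id")

    # Plain Parquet: efficient columnar files, but no transaction log.
    df.write.mode("overwrite").parquet("/tmp/orders_parquet")

    # Delta: the same columnar files plus an ACID log, which is what enables
    # schema enforcement and time travel.
    df.write.format("delta").mode("overwrite").save("/tmp/orders_delta")

    # Time travel: read the table as it existed at an earlier version.
    v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/orders_delta")

    # Z-ordering co-locates rows with similar values of the chosen columns,
    # improving data skipping for selective queries on those columns.
    spark.sql("OPTIMIZE delta.`/tmp/orders_delta` ZORDER BY (order_id)")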
3. Scenario-Based SQL:
An advanced SQL problem involving:
Common Table Expressions (CTEs): Used for query modularization.
Conditional Joins: Applied conditional logic for JOIN operations
between two tables.
Focused on both performance optimization and query clarity; one plausible shape of such a query is sketched below.
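In the sketch, the orders and promos tables, the threshold, and the promo rules are all invented for illustration.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("cte-sketch").getOrCreate()

    spark.createDataFrame(
        [(1, "online", 900.0), (2, "store", 300.0), (3, "online", 150.0)],
        "order_id INT, channel STRING, amount DOUBLE",
    ).createOrReplaceTempView("orders")

    spark.createDataFrame(
        [("online", 500.0, "FREESHIP"), ("store", 250.0, "INSTORE10")],
        "channel STRING, min_amount DOUBLE, promo STRING",
    ).createOrReplaceTempView("promos")

    spark.sql("""
        WITH big_orders AS (               -- CTE: modularizes the filter step
            SELECT * FROM orders WHERE amount > 200
        )
        SELECT o.order_id, o.amount, p.promo
        FROM big_orders o
        LEFT JOIN promos p
          ON o.channel = p.channel         -- conditional join: the match depends
         AND o.amount >= p.min_amount      -- on the channel and a threshold
    """).show()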
Round 4 - Hiring Manager Discussion
Overview:
The final round was focused on understanding your personality, motivations, and alignment
with the company’s goals.
Topics Covered:
1. Project Experience:
Discussed major projects, highlighting achievements and challenges.
Explored the usage of tools like Databricks, Spark, and Delta Lake in your
projects.
2. Career Motivation:
Addressed questions like:
Why are you looking to switch roles?
What excites you about this opportunity?
How does this position align with your long-term career goals?
3. Relocation Questions:
The interviewer evaluated flexibility regarding relocation (e.g., moving to
Bangalore while currently settled in Hyderabad).
Emphasized your openness to adapt based on career growth opportunities.
Glassdoor Myntra Review –
https://www.glassdoor.co.in/Reviews/Myntra-Reviews-E508705.htm
Myntra Careers –
https://careers.myntra.com/
Subscribe to my YouTube Channel for Free Data Engineering Content –
https://www.youtube.com/@shubhamwadekar27
Connect with me here –
https://bento.me/shubhamwadekar
Checkout more Interview Preparation Material on –
https://topmate.io/shubham_wadekar
For personal use only. Redistribution or resale is prohibited. © Shubham Wadekar