Data Engineer
Position: Data Engineer
Experience: 6 months to 1 year
Qualification: Graduate with a Computer Science specialization or similar from a reputed institution
Location: Gurgaon (currently work from home until further notice)
We are seeking skilled and motivated Data Engineers to design, develop, and maintain scalable data
pipelines and systems. The ideal candidates will have expertise in data architecture, ETL/ELT
processes, and modern data technologies, ensuring seamless integration and accessibility of data
across the organization. This role requires a blend of technical acumen, analytical skills, and problem-
solving abilities to support the organization’s data-driven initiatives.
Primary Skills: Python, SQL, Spark, Data Lakehouse, ETL, Linux
Secondary Skills: Cloud infrastructure (AWS/Azure), Kafka, CI/CD, Terraform, Databricks
Job Description:
1) Proficiency in working with large datasets, databases, and data integration tools.
2) Strong programming skills, especially in Python or Scala.
3) Familiarity with Big Data frameworks and libraries used for data processing; Apache
Spark is strongly preferred.
4) Knowledge of designing, building, and maintaining efficient data pipelines and ETL processes.
5) Experience with tools such as Apache Kafka and Apache Airflow is preferred.
6) Understanding of data modelling principles, including relational and dimensional data
models.
7) Proficiency in designing and optimizing data architectures for performance and
scalability.
8) Familiarity with distributed computing platforms such as Hadoop and Spark, NoSQL
databases, and cloud platforms.
9) Knowledge of data quality assessment, data cleansing techniques, and data governance
best practices to ensure data accuracy, consistency, and compliance is preferred.
10) Strong SQL skills, with experience on cloud provider platforms, for building Big Data
pipelines and solutions based on architected plans.
11) Implement data services and tools to ingest, egress, and transform data from multiple
sources.
12) Responsible for creating ETL pipelines using an ecosystem such as Azure Databricks
(PySpark) and Azure Data Factory; a minimal illustrative sketch follows this list.
13) Design data ingestion into data modelling services to create cross-domain data models
for end-user consumption.
14) Implement ETL and related jobs to curate, transform, and aggregate data into source
models for end-user analytics use cases.
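For illustration only, the sketch below shows the kind of PySpark ETL job referred to in items 12 and 14: ingest raw data, cleanse it, aggregate it into a source model, and write it out for analytics. The paths, column names, and dataset (orders data, /mnt/raw/orders/, /mnt/curated/daily_sales/) are hypothetical assumptions, not part of this role description; on Azure Databricks the final write would typically target Delta Lake rather than plain Parquet.

```python
# Minimal PySpark ETL sketch; all paths and columns are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: ingest raw CSV files from a landing zone (hypothetical path).
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("/mnt/raw/orders/")
)

# Transform: basic cleansing and derived columns.
curated = (
    raw.dropDuplicates(["order_id"])                     # deduplicate on the business key
       .filter(F.col("amount").isNotNull())              # drop incomplete records
       .withColumn("order_date", F.to_date("order_ts"))  # derive a partition/date column
)

# Aggregate: build a simple source model for end-user analytics.
daily_sales = (
    curated.groupBy("order_date", "region")
           .agg(F.sum("amount").alias("total_amount"),
                F.count("order_id").alias("order_count"))
)

# Load: write the curated model to the analytics layer (hypothetical path).
(daily_sales.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("/mnt/curated/daily_sales/"))
```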
Soft Skills:
• Strong problem-solving and analytical thinking abilities.
• Excellent communication skills for cross-functional collaboration.