Name: Gajanan Balasaheb Surywanshi
Mobile: 7721835799
Email: [email protected]
Objective:
To work in a challenging, learning-oriented environment, utilizing my skills and knowledge to the best of my abilities
and contributing positively to my personal growth as well as the growth of the organization.
Professional Summary:
Data Engineer with over 4 years of experience designing and implementing data solutions. Proficient in
Python, PySpark, SQL, AWS, and Pandas. Proven track record of developing scalable data pipelines,
optimizing data processing workflows, and ensuring data quality.
Strong problem-solving skills and a deep understanding of big data technologies and cloud platforms.
Developed and maintained ETL pipelines using Python and PySpark, handling large-scale data processing
and transformation tasks.
Optimized SQL queries for efficient data retrieval and aggregation, resulting in a 20% reduction in query
execution time.
Designed and implemented data warehouse solutions on AWS.
Developed data integration workflows, enabling seamless data movement across different systems and
platforms.
Conducted data quality assessments and implemented data cleansing and validation procedures to
ensure data accuracy and integrity.
Worked closely with cross-functional teams to understand data requirements and provide technical
solutions for data-driven initiatives.
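For illustration, below is a minimal sketch of the kind of PySpark ETL pipeline described above. The S3 paths and column names are hypothetical examples, not actual project details.

```python
# Minimal PySpark ETL sketch: extract raw CSV from S3, clean and
# transform it, and load partitioned Parquet back to S3.
# All paths and column names below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Extract: read raw event data
raw = spark.read.csv("s3://example-bucket/raw/events/",
                     header=True, inferSchema=True)

# Transform: deduplicate, parse timestamps, drop invalid rows
clean = (
    raw.dropDuplicates(["event_id"])
       .withColumn("event_ts", F.to_timestamp("event_ts"))
       .withColumn("event_date", F.to_date("event_ts"))
       .filter(F.col("user_id").isNotNull())
)

# Load: write Parquet partitioned by date for efficient querying
clean.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/events/"
)
```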
Professional Experience:
Working at NOWON TECHNOLOGIES PRIVATE LIMITED as a Data Engineer from January 2021 to date.
Personal Skills:
Positive attitude, determined, and energetic.
Hardworking and sincere.
Good decision-making and analytical skills.
Able to manage people efficiently.
Technical Skill Set:
Programming Languages: Python
Big Data Technologies: PySpark
Databases: SQL
Cloud Platforms: Amazon Web Services (AWS)
Data Processing and Analysis: Pandas, Redshift, Athena
ETL Tools: Apache Airflow, AWS Glue
Version Control: Git
IDE: Jupyter Notebook
Educational Background/Training and Certifications:
Highest Qualification: BBA (Marketing)
Projects:
1. Title: Data Lake Implementation and Analytics
Description: Set up a data lake infrastructure on AWS using services such as S3, Glue, and Athena,
and performed advanced analytics on the stored data.
Period: January 2021 to June 2023
Position: Data Engineer
Responsibilities:
Designed and configured the data lake architecture.
Developed ETL jobs using AWS Glue.
Handled data ingestion and transformation.
Created optimized data schemas in Athena.
Performed complex analytics and data processing with Spark.
Collaborated with stakeholders on data exploration and insights.
Technical Skills: Python, SQL, AWS, Apache Spark
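For illustration, a skeleton of an AWS Glue PySpark job of the kind used in this project: a catalog table is read, remapped, and written as Parquet to S3, where Athena can query it through the Data Catalog. The database, table, column, and path names are hypothetical.

```python
# Standard AWS Glue job boilerplate; names below are hypothetical.
import sys
from awsglue.transforms import ApplyMapping
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Extract: read the raw table registered in the Glue Data Catalog
source = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="events"
)

# Transform: rename and retype columns for the curated layer
mapped = ApplyMapping.apply(
    frame=source,
    mappings=[
        ("eventId", "string", "event_id", "string"),
        ("ts", "string", "event_ts", "timestamp"),
    ],
)

# Load: write Parquet to S3; Athena queries this location
glue_context.write_dynamic_frame.from_options(
    frame=mapped,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/curated/events/"},
    format="parquet",
)
job.commit()
```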
2. Title: Personalized Learning Pathways with Real-Time Feedback
Description: Develop and maintain an end-to-end data pipeline for a personalized learning
platform. This involves extracting, cleaning, and transforming data from various sources
and integrating it into a scalable AWS Redshift data warehouse using tools such as
Apache Airflow and Python to ensure high-quality, real-time data for optimized
learning experiences.
Period: July 2023 to Present
Position: Data Engineer
Responsibilities:
Extract data from different sources, including user interactions, user profiles, and feedback.
Build ETL pipelines to integrate and aggregate data from the different sources into a unified
data warehouse, using tools such as Apache Airflow, dbt, or custom scripts.
Implement data cleaning and transformation processes to ensure data quality, using tools
such as Apache Spark or Pandas.
Store processed data in a scalable data warehouse (e.g., AWS Redshift) optimized for querying.
Technical Skills: Python, Pandas, SQL, AWS, Apache Airflow, Redshift
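For illustration, a condensed Airflow DAG sketch of the extract-clean-load flow described above, written with the Airflow 2.x TaskFlow API. The connection ID, source URL, and table name are hypothetical; at scale the Redshift load would typically be a COPY from S3 rather than row inserts.

```python
# Hypothetical Airflow DAG: extract -> clean with Pandas -> load to Redshift.
from datetime import datetime

import pandas as pd
from airflow.decorators import dag, task


@dag(schedule="@hourly", start_date=datetime(2023, 7, 1), catchup=False)
def learning_pipeline():

    @task
    def extract() -> str:
        # Pull raw interaction data from a (hypothetical) internal API
        df = pd.read_json("https://example.internal/api/interactions")
        path = "/tmp/interactions_raw.csv"
        df.to_csv(path, index=False)
        return path

    @task
    def transform(path: str) -> str:
        # Clean: drop duplicates and rows missing a user id
        df = pd.read_csv(path).drop_duplicates().dropna(subset=["user_id"])
        out = "/tmp/interactions_clean.csv"
        df.to_csv(out, index=False)
        return out

    @task
    def load(path: str) -> None:
        # Redshift speaks the Postgres protocol, so the Postgres hook works;
        # "redshift_default" and "staging.interactions" are hypothetical.
        from airflow.providers.postgres.hooks.postgres import PostgresHook

        df = pd.read_csv(path)
        hook = PostgresHook(postgres_conn_id="redshift_default")
        hook.insert_rows(
            table="staging.interactions",
            rows=df.itertuples(index=False, name=None),
            target_fields=list(df.columns),
        )

    load(transform(extract()))


learning_pipeline()
```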
Declaration
I hereby declare that the information furnished above is complete and true to the best of my knowledge.
Place: Pune