SINDHUPAARKAVI M. S
Software Engineer
Phone no: 9789728781
Email: [email protected]

About Myself
Over 7 years of hands-on IT experience in Python, Big Data and Hadoop technologies in the
banking and sales domains. Strong knowledge of Python, Spark, Unix, Big Data ETL processes,
related Hadoop technologies and scripting. Proficient in developing Python scripts, with
experience in Hive, HBase, Impala, Spark, MapReduce and other Hadoop ecosystem tools.
Career Objective
To achieve high career growth through a continuous learning process and to remain dynamic,
visionary and competitive in a changing world.
Technical skills
Languages known : Python, Spark, C, C++, Unix shell, SQL, PL/SQL, Hadoop, Cassandra
Scheduling Tools : Control-M, Airflow
Other Tools : Informatica, Enterprise JIRA
Version Control : Git, Bitbucket
Platforms : UNIX, Windows, Hadoop, Hive
Databases : MySQL, Oracle 10g, 11g, 12c
SDLC Process : Agile Software Development with Scrum
Organizational Experience
Company                      Profile                    Duration
Tata Consultancy Services    IT Analyst                 16th Jun 2016 – 5th Aug 2022
Dell Technologies            Data Engineer - Advisor    8th Aug 2022 – Till Date
Projects :
1) Tata Consultancy Services :
DataLab / Secure Data Lake
• Project Name : DataLab / Secure Data Lake
• Client : Deutsche Bank
• Period : 12th January – Till Date
• Work Location : Bangalore, India
• Technologies Used : Python, Hadoop, Hive, HDFS, MySQL, Spark, Impala, Oracle (10g/11g), HBase
• Project Description : Objective – Extract trade data from various sources, enrich
the data and make it available to users for analysis.
Description – DataLab / Secure DataLake is a programme initiated by Deutsche Bank that extracts
a variety of data, i.e. text, XML, CSV, XLSX, JSON and Avro files, from various sources, such as
SFTP mailboxes and traditional RDBMS sources (via Sqoop), into the Hadoop platform. The data is
staged in Hadoop after applying transformations and business logic using the ETL tool Datameer.
The enriched data is then loaded into Hive tables, which are exposed to users. Users use the data
to generate reports and draw analysis. DataLab is a one-stop shop for the trade and finance data
of Deutsche Bank.
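As an illustration of the staging flow described above, a minimal PySpark sketch of a
JSON-to-Hive load might look like the following (the application name, HDFS path, column and
table names are all hypothetical assumptions, and the real transformations were applied in
Datameer; this only sketches the Spark-based load step):

from pyspark.sql import SparkSession

# A minimal sketch of the load step, assuming Spark with Hive support.
spark = (
    SparkSession.builder
    .appName("stage_trades")            # hypothetical app name
    .enableHiveSupport()
    .getOrCreate()
)

# Read a raw JSON extract from HDFS (hypothetical path).
raw = spark.read.json("hdfs:///datalab/landing/trades/")

# Apply a simple enrichment / business-logic step (hypothetical column).
enriched = raw.filter(raw["trade_status"] == "CONFIRMED")

# Expose the enriched data to users as a Hive table (hypothetical table name).
enriched.write.mode("overwrite").saveAsTable("datalab.trades_enriched")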
Role/Responsibility :
• Created and developed automations in Python that significantly reduced manual work.
• Involved in understanding and gathering requirements.
• Management skills such as resource management, job scheduling, performance tuning and
benchmarking.
• Experienced in developing Unix shell scripts.
• Experienced in deploying infrastructure changes based on changing business requirements,
technical specifications and compliance standards.
• Involved in identifying business-critical scenarios and developing the design.
• Involved in creating Python scripts to perform housekeeping and performance activities.
• Involved in onboarding data and creating ETL pipelines.
• Involved in identifying, designing and developing a test strategy to perform unit testing.
• Involved in developing scripts to read complete Avro/JSON files and load them into Hive and
Impala without any data leakage.
• Involved in developing scripts to read CSV files and load them into RDBMS tables using Python.
• Involved in developing scripts to perform checksum validation using the MD5 algorithm (see the
sketch after this list).
• Involved in developing constraints on Oracle tables so that only valid data is loaded.
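The checksum validation mentioned above could look roughly like the sketch below, assuming the
source system ships an .md5 sidecar file alongside each extract (the file names and the sidecar
convention are illustrative assumptions, not taken from the project):

import hashlib

def md5_checksum(path, chunk_size=8 * 1024 * 1024):
    """Compute the MD5 digest of a file, reading in chunks so large
    extracts do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def validate(data_file, checksum_file):
    """Compare the computed digest against the value supplied by the
    source system (assumed here to be the first token in a .md5 file)."""
    with open(checksum_file) as fh:
        expected = fh.read().split()[0].strip().lower()
    actual = md5_checksum(data_file)
    if actual != expected:
        raise ValueError(f"Checksum mismatch for {data_file}: "
                         f"expected {expected}, got {actual}")

if __name__ == "__main__":
    # Hypothetical file names for illustration only.
    validate("trades_20220805.csv", "trades_20220805.csv.md5")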
2) Dell Technologies :
• Understanding the requirement document and creating a baseline for code development.
• Complete end-to-end solution design for the Premier process.
• Managing ETL from multiple sources, such as TD, SQL Server and HDFS, to similar destinations.
• Enhancement and maintenance of the Premier & UxFunnel Premier cube.
• Maintaining code artifacts in Git and automating the ETLs using Airflow (see the sketch below).
• Built products using Apache Spark, Hive, Python, Scala Spark, Splunk, SQL, shell scripts,
Airflow, Cassandra, etc., as the project demanded.
• Creating and maintaining the QCs for Premier jobs.