Maheswari Kunapareddy

Email: [email protected]
Phone: 518-994-6742
________________________________________________________________________________________________________

PROFESSIONAL SUMMARY

 Over 9 years of experience in Big Data technologies, data pipelines, SQL/NoSQL, cloud-based RDS, distributed
databases, serverless architecture, data mining, web scraping, and cloud technologies such as AWS EMR, Redshift,
Lambda, Step Functions, and CloudWatch.
 Hands-on experience using Hadoop ecosystem tools such as HDFS, Hive, Apache Spark, Apache Sqoop, Flume, Oozie,
Apache Kafka, Apache Storm, YARN, Impala, Zookeeper, and Hue.
 Experience in migrating SQL databases to Azure Data Lake, Azure Data Lake Analytics, Azure SQL Database,
Databricks, and Azure SQL Data Warehouse; controlling and granting database access; and migrating on-premises
databases to Azure Data Lake Store using Azure Data Factory.
 Expertise in developing and testing MS SQL Server Business Intelligence solutions using SQL Server Integration
Services (SSIS).
 Experienced in logical and physical database design and development, normalization, and data modelling using
Erwin and SQL Server Enterprise Manager.
 Expert in developing SSIS packages to Extract, Transform, and Load (ETL) data using various types of control flow,
data flow, event handling, and logging components.
 Extensive hands-on experience with Data Flow transformations including Derived Column, Script, Slowly Changing
Dimension, Lookup, Data Conversion, Conditional Split, Merge, Multicast, and Union All.
 Experience in report building and creating various types of reports such as table, matrix, chart, drill-down,
drill-through, sub-reports, and ad-hoc reports.
 Experience in designing dashboards and reports, parameterized reports, predictive analysis in Power BI.
 Hands-on experience in deploying, scheduling, and subscribing to SSRS reports using Report Manager.
 Expert in database modelling and dimension and fact table design, including both star and snowflake schema design.
 Hands-on experience in creating jobs, alerts, SQL Mail Agent, and Database Mail, and scheduling DTS and SSIS
packages using SQL Server Agent and ActiveBatch.
 Knowledge of C#, VB.NET, Windows scripts, SQL scripts, PostgreSQL, SQL Server, Oracle, DB2, Teradata, PL/SQL,
UltraEdit, PuTTY, data modelling, SSAS, and Cognos.
 Experience in developing Spark applications using Spark SQL in Databricks for data extraction, transformation,
and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into
customer usage patterns (illustrated in the sketch after this list).
 Designed and developed Power BI graphical and visualization solutions based on business requirement documents
and plans for creating interactive dashboards.
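
A minimal PySpark sketch of the multi-format extraction and aggregation pattern described above. The paths, column names, and the usage_events feed are illustrative assumptions, not actual project artifacts:

# Read the same logical feed delivered as CSV, JSON, and Parquet,
# align schemas, and aggregate daily usage per customer.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("usage-aggregation").getOrCreate()

csv_df = spark.read.option("header", True).csv("/mnt/raw/usage_events/csv/")
json_df = spark.read.json("/mnt/raw/usage_events/json/")
parquet_df = spark.read.parquet("/mnt/raw/usage_events/parquet/")

cols = ["customer_id", "event_ts", "usage_kwh"]
events = (
    csv_df.select(*cols)
    .unionByName(json_df.select(*cols))
    .unionByName(parquet_df.select(*cols))
    .withColumn("usage_kwh", F.col("usage_kwh").cast("double"))
    .withColumn("event_date", F.to_date("event_ts"))
)

# Daily usage per customer, written to the curated layer
daily_usage = events.groupBy("customer_id", "event_date").agg(
    F.sum("usage_kwh").alias("total_kwh"),
    F.count(F.lit(1)).alias("event_count"),
)
daily_usage.write.mode("overwrite").parquet("/mnt/curated/daily_usage/")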

TECHNICAL SKILLS
Languages

C, .NET, Java, SQL, T-SQL, and U-SQL.

BI Tools

SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), SQL Server Reporting Services (SSRS), Power BI,
Data Transformation Services (DTS), Visio, Erwin.

Databases

SQL Server 2016/2014/2012/2008 R2, Azure SQL DB, MS Access, Oracle 10g/11g.

Cloud technologies

Azure Data Lake, Azure Data Factory, Azure Databricks, Azure SQL Database, Azure SQL Data Warehouse.

PROJECT EXPERIENCE
PPL Limited – Allentown, PA

Data Engineer

Jun’22 – Present

Description: PPL Corporation is a high-performing utility company focused on delivering electricity, building more
dynamic power grids, and advancing sustainability.

Responsibilities:

 Involved in analyzing, designing, and building modern data pipelines based on business requirements, and
provided documentation.
 Extracted and transformed data from source systems to Azure Data Storage services using a combination of Azure
Data Factory, Spark SQL, and Azure Data Lake Analytics, and processed the data in Azure Databricks.
 Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data from
different sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse.
 Expertise in creating, debugging, scheduling, and monitoring jobs using ADL.
 Created Databricks notebooks using Python (PySpark), Scala, and Spark SQL to transform data stored in Azure Data
Lake from the Raw to the Stage and Curated layers.
 Responsible for estimating cluster size and for monitoring and troubleshooting the Spark Databricks cluster.
 Hands-on experience developing SQL scripts for automation purposes.
 Created builds and releases for multiple projects (modules) in the production environment using Visual Studio.
 Worked closely with business analysts to convert business requirements into technical requirements and prepared
low- and high-level documentation.
 Configured and implemented Azure Data Factory triggers, scheduled the pipelines, monitored the scheduled Azure
Data Factory pipelines, and configured alerts to get notifications for failed pipelines.
 Worked extensively on Azure Data Lake Analytics and Azure Databricks to implement SCD-1 and SCD-2 approaches
(see the SCD-2 sketch after this list).
 Created Azure Stream Analytics jobs to replicate real-time data into Azure SQL Data Warehouse.
 Implemented delta logic extractions for various sources with the help of a control table; implemented data
frameworks to handle deadlocks, recovery, and logging of pipeline data.
 Developed Spark (Scala) notebooks to transform and partition data and organize files in ADLS.
 Worked on Azure Databricks to run Spark (Python) notebooks through ADF pipelines.
 Used the Databricks widgets utility to pass parameters at run time from ADF to Databricks (see the widget sketch
after this list).
 Created triggers, PowerShell scripts, and parameter JSON files for deployments.
 Worked with VSTS for the CI/CD Implementation.
 Reviewed individual work on ingesting data into Azure Data Lake and provided feedback based on the reference
architecture, naming conventions, guidelines, and best practices.
 Implemented end-to-end logging frameworks for Data Factory pipelines.
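
A hedged sketch of the SCD-2 pattern on Azure Databricks with Delta Lake, assuming an illustrative dimension path, a customer_id business key, and address/plan as tracked attributes (all hypothetical, not actual project objects):

# Close out changed current rows, then append new versions.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

updates = spark.read.parquet("/mnt/stage/customers/")          # incoming batch
dim = DeltaTable.forPath(spark, "/mnt/curated/dim_customer/")  # existing dimension

# Step 1: expire current rows whose tracked attributes changed
(
    dim.alias("t")
    .merge(updates.alias("s"),
           "t.customer_id = s.customer_id AND t.is_current = true")
    .whenMatchedUpdate(
        condition="t.address <> s.address OR t.plan <> s.plan",
        set={"is_current": "false", "end_date": "current_date()"},
    )
    .execute()
)

# Step 2: insert new versions for changed keys (now without a current row)
# and for brand-new business keys
current = (
    spark.read.format("delta").load("/mnt/curated/dim_customer/")
    .filter("is_current = true")
    .select("customer_id")
)
new_versions = (
    updates.join(current, "customer_id", "left_anti")
    .withColumn("is_current", F.lit(True))
    .withColumn("start_date", F.current_date())
    .withColumn("end_date", F.lit(None).cast("date"))
)
new_versions.write.format("delta").mode("append").save("/mnt/curated/dim_customer/")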
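
A minimal widget sketch showing how run-time parameters flow from an ADF pipeline into a Databricks notebook; the parameter names (source_path, load_date) and paths are illustrative assumptions, and dbutils/spark are provided by the Databricks notebook runtime:

# ADF supplies these as base parameters on the Databricks Notebook activity.
dbutils.widgets.text("source_path", "/mnt/raw/default/")
dbutils.widgets.text("load_date", "")

source_path = dbutils.widgets.get("source_path")
load_date = dbutils.widgets.get("load_date")

# Use the parameters to drive a single run's load
df = spark.read.parquet(source_path)
if load_date:
    df = df.filter(f"load_date = '{load_date}'")

df.write.mode("overwrite").parquet(f"/mnt/stage/output/load_date={load_date}/")

# Returning a value lets the ADF pipeline pick up the notebook's output
dbutils.notebook.exit(str(df.count()))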

Environment: Azure Data Factory, Azure Databricks, PySpark, ADLS, Azure DevOps, Blob Storage, Azure SQL Server,
Azure DW, Azure Synapse.

Blue Cross Blue Shield – Dallas, TX

Data Engineer

Jan’20 – Feb’22
Description: Blue Cross and Blue Shield of Texas (BCBSTX) is a health care coverage provider in Texas that focuses on
the health and wellness of its members.

Responsibilities:

 Involved in analyzing, planning, and defining data based on business requirements, and provided documentation.
 Worked closely with business analysts to convert business requirements into technical requirements and prepared
low- and high-level documentation.

 Performed transformations using Hive and MapReduce; hands-on experience copying .log and snappy files into HDFS
from Greenplum using Flume and Kafka, loading data into HDFS, and extracting data into HDFS from MySQL using
Sqoop.
 Imported required tables from RDBMS to HDFS using Sqoop and used Storm/Spark Streaming and Kafka to stream
real-time data into HBase (see the streaming sketch after this list).
 Worked extensively on importing metadata into Hive and migrated existing tables and applications to work on Hive
and the AWS cloud.
 Used AWS Redshift, S3, Spectrum, and Athena services to query large amounts of data stored on S3, creating a
virtual data lake without having to go through an ETL process.
 Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources such as S3
(ORC/Parquet/text files) into AWS Redshift.
 Developed views and templates with Python and Django's view controller and templating language to create a user-
friendly website interface.
 Experience in writing MapReduce jobs for text mining, working with the predictive analysis team, and working with
Hadoop components such as HBase, Spark, YARN, Kafka, Zookeeper, Pig, Hive, Sqoop, Oozie, Impala, and Flume.
 Wrote Hive UDFs as per requirements to handle different schemas and XML data.
 Implemented ETL code to load data from multiple sources into HDFS using Pig Scripts.
 Developed data pipelines using Python and Hive to load data into the data lake; performed data analysis and data
mapping for several data sources.
 Loaded data into S3 buckets using AWS Glue and PySpark; filtered data stored in S3 buckets using Elasticsearch and
loaded data into Hive external tables (see the S3 batch sketch after this list).
 Used AWS EMR to transform and move large amounts of data into and out of other AWS data stores and
databases, such as Amazon Simple Storage Service (Amazon S3) and Amazon DynamoDB.
 Designed a new member and provider booking system that allows providers to book new slots, sending the member
leg and provider leg directly to TP through DataLink.
 Opened SSH tunnels to Google Dataproc to access the YARN manager and monitor Spark jobs.
 Analyzed various types of raw files such as JSON, CSV, and XML with Python using Pandas, NumPy, etc.
 Worked with Play framework and Akka parallel processing.
 Developed Spark applications using Scala for easy Hadoop transitions; hands-on experience writing Spark jobs and
using the Spark Streaming API with Scala and Python.
 Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive; developed Spark code and
Spark SQL/Streaming for faster testing and processing of data.
 Installed Oozie workflow engine to run multiple Hive and Pig jobs.
 Designed and developed user-defined functions (UDFs) for Hive, developed Pig UDFs to pre-process data for
analysis, and gained experience with UDAFs for custom data-specific processing.
 Automated the existing scripts for performance calculations using scheduling tools like Airflow.
 Designed and developed the core data pipeline code, involving work in Java and Python, built on Kafka and Storm.
 Good knowledge and experience with partitioning and bucketing concepts in Hive; designed both managed and
external tables in Hive for optimized performance.
 Performance tuning using partitioning and bucketing of Impala tables.
 Created cloud-based software solutions written in Scala using Spray IO, Akka, and Slick.
 Hands-on experience fetching live stream data from DB2 into HBase tables using Spark Streaming and Apache Kafka.
 Experience in job workflow scheduling and monitoring tools like Oozie and Zookeeper.
 Worked on NoSQL databases including HBase and Cassandra.
 Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.
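
A minimal Spark Structured Streaming sketch for consuming a Kafka topic. The broker address, topic name, and payload schema are illustrative assumptions; the project landed data in HBase, which would use an HBase connector in place of the Parquet sink shown here, and the job needs the spark-sql-kafka package on its classpath:

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-stream").getOrCreate()

# Assumed shape of the JSON messages on the topic
schema = (
    StructType()
    .add("claim_id", StringType())
    .add("member_id", StringType())
    .add("amount", DoubleType())
)

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "claims-events")
    .load()
)

# Kafka delivers key/value as binary; decode and parse the JSON payload
parsed = raw.select(
    F.from_json(F.col("value").cast("string"), schema).alias("event")
).select("event.*")

query = (
    parsed.writeStream.format("parquet")
    .option("path", "hdfs:///data/streaming/claims/")
    .option("checkpointLocation", "hdfs:///checkpoints/claims/")
    .outputMode("append")
    .start()
)
query.awaitTermination()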
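
A short S3 batch sketch of the transform pattern run on EMR/Glue: read raw files from S3, filter and reshape them, and write partitioned Parquet back to a curated prefix. Bucket names and columns are illustrative assumptions:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("s3-batch-transform").getOrCreate()

raw = spark.read.option("header", True).csv("s3://example-raw-bucket/campaign/")

curated = (
    raw.filter(F.col("status") == "ACTIVE")
    .withColumn("load_date", F.current_date())
    .select("campaign_id", "member_id", "channel", "load_date")
)

# Partitioned Parquet keeps Athena/Spectrum scans cheap on the curated layer
(
    curated.write.mode("overwrite")
    .partitionBy("load_date")
    .parquet("s3://example-curated-bucket/campaign/")
)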

Environment: MapReduce, HDFS, Hive, Pig, HBase, Python, SQL, Sqoop, Flume, Oozie, Impala, Scala, Spark, Apache Kafka,
Play, AWS, Akka, Zookeeper, J2EE, Linux Red Hat, HP-ALM, Eclipse, Cassandra, SSIS.

Change Healthcare - Chicago, IL

Data Engineer

Jan’19 to Dec’19

Description: Change Healthcare is a provider of revenue and payment cycle management and clinical information
exchange solutions, connecting payers, providers, and patients in the U.S. healthcare system.

Responsibilities:

 Analyze, design, and build modern data solutions using Azure PaaS services to support visualization of data;
understand the current production state of the application and determine the impact of new implementations on
existing business processes.
 Acquire data from primary or secondary data sources and maintain databases/data systems.
 Develop and standardize numerous reports used for audit and compliance for other departments.
 Administer the processes to establish Data governance and audit & compliance requirements.
 Locate and define new process improvement opportunities.
 Design, develop and maintain enterprise Data Warehouse with SSIS packages.
 Convert critical reports to Power BI and deploy on to O365 cloud.
 Worked on Azure Data Factory to move data from SQL Server DB to Azure SQL DB.
 Extract, transform, and load data from source systems to Azure Data Storage services using a combination of Azure
Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics); ingest data into one or more Azure services
(Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and process the data in Azure Databricks.
 Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation
from multiple file formats, analyzing and transforming the data to uncover insights into usage patterns.
 Wrote UDFs in Scala and PySpark to meet specific business requirements (see the UDF sketch after this list).
 Developed JSON Scripts for deploying the Pipeline in Azure Data Factory (ADF) that process the data using
the SQL Activity.
 Presented a proof of concept on the ETL process in Python using the PETL, Pandas, and PySpark libraries.
 Implemented custom visuals and custom JS charting libraries in Power BI using TypeScript and JavaScript.
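
An illustrative PySpark UDF of the kind written for business-specific rules; the rule itself (normalizing claim status codes) and the sample data are assumptions, not actual project logic:

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

def normalize_status(code):
    # Map free-form source status codes to a small canonical set
    if code is None:
        return "UNKNOWN"
    code = code.strip().upper()
    return {"A": "APPROVED", "D": "DENIED", "P": "PENDING"}.get(code, "OTHER")

normalize_status_udf = F.udf(normalize_status, StringType())

claims = spark.createDataFrame(
    [("1001", "a"), ("1002", "d "), ("1003", None)],
    ["claim_id", "status_code"],
)
claims.withColumn("status", normalize_status_udf("status_code")).show()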

Environment: Azure Data Factory (ADF), Pandas, PySpark, SQL Server 2014/2017, Visual Studio 2015, ETL, SSIS, Reporting
Services (SSRS), SQL Server Data Tools (SSDT), Data Warehouse, JIRA, TFS, Git, Azure, Jupyter, DAX, Power BI.

Verizon Communications – Virginia, VA

Data Engineer

Jan’17 – Nov’18

Description: Verizon Communications Inc. is an American multinational telecommunications conglomerate.

Responsibilities:
 Created infrastructure for optimal extraction, transformation, and loading of data from a wide variety of data
sources.
 Designed and created optimal pipeline architecture on Azure platform.
 Created pipelines in Azure using ADF to get the data from different source systems and transform the data by
using many activities.
 Worked on mapping data flow activities in Azure Data Factory.
 Created linked services to land data from different sources into Azure Data Factory.
 Worked on SQL Server Integration Services (SSIS) packages with the Azure-SSIS integration runtime to run SSIS
packages in ADF.
 Implemented authentication mechanism using Azure Active Directory for data access and ADF.
 Created different types of triggers to automate the pipeline in ADF.
 Created and provisioned the different Databricks clusters needed for batch and continuous streaming data
processing and installed the required libraries on the clusters.
 Involved in Migrating Objects from Teradata to Snowflake.

 Created several Databricks Spark jobs with PySpark to perform table-to-table operations (see the sketch after
this list).
 Developed Azure SQL Data Warehouse SQL scripts with PolyBase support, processing files stored in Azure Storage
and Azure Data Lake.
 Created U-SQL scripts and ran them as Azure Data Lake Analytics jobs to process files in the Data Lake.
 Worked on SQL scripts, T-SQL stored procedures, triggers, queries, and packages to load data into SQL Server and
SQL Data Warehouse.
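
A brief sketch of a Databricks PySpark job performing table-to-table operations: read source tables, join and aggregate, and write the result to a target table. Database and table names are illustrative assumptions:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

orders = spark.read.table("staging.orders")
customers = spark.read.table("staging.customers")

# Join, derive, and aggregate before landing the result in the curated schema
order_summary = (
    orders.join(customers, "customer_id", "inner")
    .withColumn("order_month", F.date_format("order_date", "yyyy-MM"))
    .groupBy("customer_id", "order_month")
    .agg(F.sum("order_amount").alias("monthly_spend"))
)

order_summary.write.mode("overwrite").saveAsTable("curated.order_summary")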

Environment: Azure SQL Server, Azure Data Warehouse, Azure Storage, SSIS, Azure Data Lake, Azure Data Lake
Analytics, Azure Data Factory, Logic Apps, Function Apps, Event Hubs, Event Grids, SQL Server, Visual Studio.

Max Kelly Service - Hyderabad, India

BI Developer

Jul’14 – Jul’16

Description: Max Kelly is a staffing services and solutions firm that specializes in uniting flexible and direct-hire
professionals in technology, especially in the areas of applications, infrastructure, end-user support, and
communications technologies.

Responsibilities:

 Collaborated with system administration team on improvements regarding disk configuration, RAID level
according to database sizing and capacity planning.
 Actively involved in SQL Server 2008 installation, maintenance, and performance tuning.
 Monitored server performance with Windows Performance Monitor, SQL Profiler, Query Analyzer,
Execution Plan, Server Trace, Client Statistics, or system stored procedures.
 Checked data file allocation, index, fill factor, and fragmentation.
 Maintained data consistency and synchronization with other branches using Replication.
 Performed database design and development using SQL Server 2005 Management Studio.
 Designed and implemented Microsoft Business Intelligence in SQL Server 2008.
 Migrated DTS packages from SQL Server 2005 to 2008 so they could be reused in the new environment.
 Created SSIS packages to perform data cleansing and load data into the desired destination in the preferred format.
 Created SSIS packages to map data sources specifically to the data destination locations for an OLAP cube or
other usage, and to fulfill SQL Server maintenance tasks including rebuilding indexes, backing up databases,
and updating statistics.
 Built SSRS projects to generate reports in the desired forms (web page, PDF, Excel, etc.) to present statistical
information according to business requirements.
 Created data models with Report Builder to provide templates for end users to create reports themselves from a
pre-selected group of relational tables.
 Implemented project using Agile methodology.

Environment: SQL Server 2008/2005, DTS, MS Excel, SSIS 2008 R2, SSRS 2008 R2, OLE DB, SharePoint 2010.
