Bala Hari - Big Data & AWS
+91 8870154502 | [email protected] | LinkedIn | Chennai, India.
Experience
Comcast Sep ’21 – Present
Development Engineer - 1 (Big Data) Chennai, India.
• Built Data Pipelines and ETL processes for multiple features using Spark, Databricks and AWS Services.
• Migrated legacy data from on-premises systems to AWS Cloud, reducing maintenance cost by 20%.
• Managed Data Quality by enriching the datasets used by the downstream data applications.
• Created monitoring and alerting for data discrepancies in pipelines and infrastructure components using
AWS CloudWatch and ServiceNow, maintaining reliability and enabling quick support assistance.
• Developed Infrastructure as Code (IaC) using Terraform to provision and manage resources in AWS.
• Implemented CI/CD best practices using GitHub and Concourse CI, eliminating the 10-minute manual
deployment process at the end of every release.
• Used AWS, Concourse CI, Databricks, GitHub, JIRA, PostgreSQL, Spark, ServiceNow and Terraform.
Agira Technologies Apr ’21 – Sep ’21
Full Stack Developer - Internship Chennai, India.
• Developed both Frontend UI and Backend API for an internal application (Data Portal).
• Implemented Single Sign-On authentication using Microsoft Azure Active Directory for easier login.
• Created an Access Management System using the AWS API to maintain access-level logs of AWS resources.
• Tested the functionality of APIs using Postman.
• Used Amazon Web Services, Angular, ExpressJS, GitHub, NodeJS, PostgreSQL and Postman.
Projects
Credit Card Fraud Detection System
• Loaded the historical card transactions dataset from a local Excel sheet into HDFS, then into AWS RDS using
the Sqoop export utility, and also into Hive-HBase tables.
• Loaded Member Score and Member Details data into Hive tables.
• Generated a Card Lookup table from Card Transactions and Member Data.
• Incoming data arrives via Kafka topics and, after validation against business logic, is marked as Genuine or Fraud.
• Used AWS RDS, Airflow, Hive, HBase, Kafka, Sqoop and Spark-Streaming.
Integration with Hadoop
• Ingested all the tables from Oracle Database to HDFS using Sqoop.
• Created Hive External tables on top of datasets.
• Implemented transformations on the tables in Hive and stored the processed data in a Hive-HBase table.
• Used Hadoop, MapReduce, Sqoop, Hive and HBase.
Technical Skills
Languages: Java, Scala, SQL.
Big Data: Hadoop, Spark, Hive, Databricks, GitHub, Concourse CI and Terraform.
AWS Cloud: Athena, CloudWatch, EMR, EventBridge, Glue, Lambda, RDS, Redshift, Step Functions, SNS and S3.
Database: HBase and PostgreSQL.
Achievements
• Solved 500+ coding problems across multiple platforms (DSA + SQL).
• Recognized as Comcast CHAMP for exceptional contribution in Q2 2022.
Education
Bachelor of Engineering (Computer Science) May 2020
Jai Shriram Engineering College - Anna University, Chennai
Coding Profiles
• GeeksforGeeks
• LeetCode