maximus-lee-678/ADMW

ADMW - AWS Data Migration Workflow 📦

  • A data migration pipeline built on several AWS services.
  • From flat file to relational database.
  • Data is processed in parallel using AWS Glue and Spark to reduce the time needed to extract, transform, and load data.

Two-Stage Migration Process ⏩

Preliminary Stage

  1. Reads data from a flat file and performs any needed transformations on it.
  2. Loads the transformed data into a preliminary table (contains additional columns for status tracking and data integrity checks).
  3. Generates a report of the data migration process.
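
The three preliminary steps can be sketched in plain Python. This is a minimal illustration only, not the repository's Glue job: the column names (`id`, `name`), the `status` and `row_checksum` tracking columns, the validation rule, and the function names are all assumptions made for the example.

```python
import csv
import hashlib
import io
from collections import Counter


def transform(row):
    """Hypothetical transformation: trim whitespace and title-case the name."""
    return {"id": row["id"].strip(), "name": row["name"].strip().title()}


def to_preliminary(row):
    """Add assumed tracking columns: a status flag and a row checksum."""
    clean = transform(row)
    # Assumed validation rule: numeric id and non-empty name.
    status = "VALID" if clean["id"].isdigit() and clean["name"] else "REJECTED"
    checksum = hashlib.sha256("|".join(clean.values()).encode("utf-8")).hexdigest()
    return {**clean, "status": status, "row_checksum": checksum}


def run_preliminary_stage(flat_file):
    """Read the flat file, build preliminary rows, and summarise them."""
    rows = [to_preliminary(r) for r in csv.DictReader(flat_file)]
    counts = Counter(r["status"] for r in rows)
    return rows, {"total": len(rows), **counts}


# In-memory stand-in for a flat file stored in S3.
sample = io.StringIO("id,name\n1, alice smith \n2,bob\nx3,carol\n")
preliminary_rows, report = run_preliminary_stage(sample)
print(report)
```

In a Glue job the same per-row logic would typically be expressed as Spark column operations so that it runs in parallel across partitions.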

Preliminary Stage Visual Workflow

Final Stage

  1. Moves data from the preliminary table to the final table (finalised schema with the addition of a source file name column).
  2. Generates a report of the data migration process.
  3. Compresses and stores the source file.
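
A rough sketch of the final stage follows, using an in-memory SQLite database as a stand-in for the RDS instance. The table layouts, the `status = 'VALID'` filter, the file name `customers.csv`, and the use of gzip are assumptions for illustration; the source does not specify them.

```python
import gzip
import sqlite3


def run_final_stage(conn, source_name, source_bytes):
    """Move valid rows to the final table, then compress the source file."""
    # Copy rows marked VALID, tagging each with its source file name.
    conn.execute(
        "INSERT INTO final (id, name, source_file_name) "
        "SELECT id, name, ? FROM preliminary WHERE status = 'VALID'",
        (source_name,),
    )
    conn.commit()
    moved = conn.execute(
        "SELECT COUNT(*) FROM final WHERE source_file_name = ?", (source_name,)
    ).fetchone()[0]
    archive = gzip.compress(source_bytes)  # compressed copy of the source file
    return {"rows_moved": moved, "archive_bytes": len(archive)}, archive


# Stand-in schema and data; the real tables live in RDS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE preliminary (id INTEGER, name TEXT, status TEXT)")
conn.execute("CREATE TABLE final (id INTEGER, name TEXT, source_file_name TEXT)")
conn.executemany(
    "INSERT INTO preliminary VALUES (?, ?, ?)",
    [(1, "Alice Smith", "VALID"), (2, "Bob", "VALID"), (3, "Carol", "REJECTED")],
)

source = b"id,name\n1,alice smith\n2,bob\nx3,carol\n"
report, archive = run_final_stage(conn, "customers.csv", source)
print(report["rows_moved"])
```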

Final Stage Visual Workflow

Global Features

  1. A tracking table in the RDS instance is continually updated to aid monitoring of the data migration process.
  2. Report summaries are sent via email using SNS.
  3. Users are notified of any errors occurring during a migration process via email.
  4. A Jupyter notebook is provided so that developers can easily create and test both transformation and reconciliation logic.
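
As an illustration of the email notifications, the sketch below builds the arguments for an SNS publish call from a report summary. The topic ARN, subject format, and function name are hypothetical placeholders, and the actual publish call is left as a comment because it needs boto3 and AWS credentials.

```python
def build_sns_message(job_name, report, error=None):
    """Build keyword arguments for an SNS publish call.

    The topic ARN below is a made-up placeholder, not the project's topic.
    """
    outcome = "FAILED" if error else "SUCCEEDED"
    lines = [f"{key}: {value}" for key, value in sorted(report.items())]
    if error:
        lines.append(f"error: {error}")
    return {
        "TopicArn": "arn:aws:sns:us-east-1:123456789012:admw-reports",
        # SNS subjects must be shorter than 100 characters.
        "Subject": f"ADMW {job_name} {outcome}"[:99],
        "Message": "\n".join(lines),
    }


message = build_sns_message("preliminary", {"total": 3, "VALID": 2, "REJECTED": 1})
print(message["Subject"])

# With boto3 installed and credentials configured, the message could be sent with:
#   import boto3
#   boto3.client("sns").publish(**message)
```

Keeping message construction separate from the publish call makes the formatting easy to test in the provided notebook without touching AWS.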

AWS Services Used ⭯

  1. AWS S3 - storage service
  2. AWS Lambda - serverless computing service
  3. AWS Glue - data integration service
  4. AWS RDS - relational database service
  5. AWS SNS - messaging service
  6. AWS Step Functions - orchestration service
  7. AWS CloudWatch - monitoring service
  8. AWS IAM - access control service

Video Demonstration 🎥

ADMW Demo
