salesproject: Batch processing on AWS

Project Overview

This project migrates an on-premises batch processing system to AWS, addressing challenges in reliability, scalability, and maintainability. The architecture leverages AWS services like S3, Glue, and Redshift Serverless to create a robust, serverless data pipeline. All infrastructure is provisioned using Terraform, ensuring consistency and reproducibility.

Project Structure

The repository contains the following Terraform configuration files:

backend.tf: Configures the S3 backend for storing the Terraform state file.
vpc.tf: Defines the VPC, subnets, Internet Gateway, NAT Gateway, Route Tables, and Security Groups.
iamrole.tf: Creates IAM roles and policies for secure service interactions.
redshift.tf: Provisions Redshift Serverless Workgroup, Namespace, and associated configurations.
glue.tf: Configures Glue jobs, connections, and crawlers for data processing.
providers.tf: Specifies the required Terraform providers and versions.
s3.tf: Creates S3 buckets for raw data, processed data, and backups.
variable.tf: Defines input variables for reusable configurations.
sns.tf: Sets up SNS topics for error notifications.
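
As a rough illustration of what backend.tf and providers.tf typically contain, here is a minimal sketch. The bucket name, state key, region, and version constraints are placeholders chosen for this example and are not taken from the repository; the state bucket is assumed to exist before `terraform init` runs.

```hcl
# backend.tf (sketch): keep the Terraform state file in S3.
# All values below are hypothetical placeholders.
terraform {
  backend "s3" {
    bucket  = "example-terraform-state-bucket" # assumed name; must already exist
    key     = "salesproject/terraform.tfstate" # assumed path of the state object
    region  = "us-east-1"                      # assumed region
    encrypt = true                             # encrypt the state object at rest
  }
}

# providers.tf (sketch): pin Terraform and the AWS provider.
terraform {
  required_version = ">= 1.5.0" # assumed constraint

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumed region
}
```

Backend blocks cannot reference input variables, which is why the values are written as literals here.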

Key Features

Serverless Architecture: Uses managed services such as Redshift Serverless and Glue, which scale on demand and eliminate infrastructure management overhead.
Automated Data Pipeline: Glue jobs are scheduled to run hourly to process new data files.
Error Handling: SNS sends email notifications for Glue job failures.
Infrastructure as Code: All resources are deployed using Terraform, ensuring consistency and reproducibility.
Security: Redshift is deployed in a private subnet within a VPC, IAM roles follow least-privilege access, and credentials are stored securely in Secrets Manager.
Cost Optimization: Pay-for-use model with no upfront infrastructure costs.
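
The hourly schedule described under Automated Data Pipeline could be expressed with a scheduled Glue trigger along the following lines. This is a sketch under assumptions: the trigger name and the job name `salesproject-job` are illustrative, and the actual glue.tf may wire the schedule differently.

```hcl
# Sketch: run a Glue job at the top of every hour (UTC).
resource "aws_glue_trigger" "hourly" {
  name     = "salesproject-hourly-trigger" # hypothetical trigger name
  type     = "SCHEDULED"
  schedule = "cron(0 * * * ? *)" # minute 0 of every hour

  actions {
    job_name = "salesproject-job" # assumed name of an existing Glue job
  }
}
```

The cron expression uses the six-field AWS format (minutes, hours, day of month, month, day of week, year), so it differs from a standard five-field crontab entry.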

Prerequisites

AWS Account: Ensure you have an active AWS account.
Terraform: Install Terraform on your local machine.
AWS CLI: Configure the AWS CLI with your credentials (access key and secret key).

Setup Instructions

Clone the Repository:

git clone https://github.com/HakeemSalaudeen/salesproject-batch-processing-on-AWS.git

Initialize Terraform:
```
terraform init
```
Review Variables: Update the variable.tf file with your specific configurations (for example, Redshift credentials).
Deploy Infrastructure:
```
terraform apply
```
Verify Deployment:
- Check the AWS Management Console to ensure all resources are created.
- Test the data pipeline by uploading a file to the S3 bucket.
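
For the Review Variables step above, credential inputs are usually declared without defaults and marked sensitive, so that values are not committed to the repository or echoed in plan output. The variable names below are hypothetical and may not match those in variable.tf.

```hcl
# Sketch: declare Redshift admin credentials as inputs (names are assumptions).
variable "redshift_admin_username" {
  description = "Admin user name for the Redshift Serverless namespace"
  type        = string
}

variable "redshift_admin_password" {
  description = "Admin password for the Redshift Serverless namespace"
  type        = string
  sensitive   = true # redacts the value in plan and apply output
}
```

With declarations like these, the values can be supplied at apply time through environment variables such as `TF_VAR_redshift_admin_password` instead of being written into the file.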

Code Quality

Code Linting: All Terraform files are formatted using terraform fmt for consistency.
Best Practices: Follows Terraform best practices for modularity and readability.

Monitoring and Maintenance

CloudWatch: Use CloudWatch to monitor Glue jobs, Redshift performance, and system logs.
SNS Alerts: Configure SNS topics to receive notifications for failures.
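
One common way to connect Glue job failures to an SNS email alert is an EventBridge rule that matches Glue job state changes and targets the topic. The sketch below assumes that wiring; the resource names and the email address are placeholders, and the project's sns.tf may use a different mechanism.

```hcl
# Sketch: email alert when any Glue job fails or times out.
resource "aws_sns_topic" "glue_failures" {
  name = "salesproject-glue-failures" # hypothetical topic name
}

resource "aws_sns_topic_subscription" "email" {
  topic_arn = aws_sns_topic.glue_failures.arn
  protocol  = "email"
  endpoint  = "alerts@example.com" # placeholder address
}

# Match Glue job runs that end in a failed or timed-out state.
resource "aws_cloudwatch_event_rule" "glue_job_failed" {
  name = "salesproject-glue-job-failed" # hypothetical rule name

  event_pattern = jsonencode({
    source        = ["aws.glue"]
    "detail-type" = ["Glue Job State Change"]
    detail = {
      state = ["FAILED", "TIMEOUT"]
    }
  })
}

resource "aws_cloudwatch_event_target" "to_sns" {
  rule = aws_cloudwatch_event_rule.glue_job_failed.name
  arn  = aws_sns_topic.glue_failures.arn
}

# Allow EventBridge to publish to the topic.
resource "aws_sns_topic_policy" "allow_eventbridge" {
  arn = aws_sns_topic.glue_failures.arn

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "events.amazonaws.com" }
      Action    = "sns:Publish"
      Resource  = aws_sns_topic.glue_failures.arn
    }]
  })
}
```

Email subscriptions stay pending until the recipient confirms them from the message SNS sends, so alerts will not arrive before that confirmation.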

Contributing

Contributions are welcome! Please follow these steps:

Fork the repository.
Create a new branch for your feature or bug fix.
Submit a pull request with a detailed description of your changes.

Happy Coding! 🚀
