Review – 2
This review assesses the team's progress, implementation, and adherence to
DevOps principles.
Evaluation rubric (70% Completion Status) – Carries 8 Marks:

Criteria | Evaluation Parameters | Marks (Out of 8)
Project Progress & Implementation | 70% completion achieved as per plan; working functionalities demonstrated. | 2 Marks
Version Control & CI/CD Integration | Proper use of Git (commit history, branching); CI/CD pipeline implemented and functional. | 2 Marks
Infrastructure & Deployment | Infrastructure as Code (IaC) implemented using Terraform/Ansible; deployment to cloud/on-prem environment. | 1.5 Marks
Monitoring, Logging & Security | Monitoring tools integrated (Grafana, Prometheus, ELK); security measures (SonarQube, OWASP ZAP, Trivy) implemented. | 1.5 Marks
Presentation & Documentation | Clear and well-structured presentation; proper documentation submitted (progress report, diagrams, test cases). | 1 Mark
DEVOPS MINOR PROJECT REVIEW 2 REPORT (70% Completion Status) – Carries 2 Marks
Date: [21/03/2025]
Project Title: [Enter Project Title]
Faculty Mentor: [Enter Mentor Name]
Team Members:
1. Aditya vijay_RA2211027010127
2.
1. Project Overview
● Problem Statement: Many businesses struggle to predict and
prevent customer churn because traditional methods are not effective.
Losing customers can negatively impact revenue and growth. To solve
this problem, we need a smart system that can accurately identify
customers who might stop using a service. This project aims to use
machine learning to analyze customer behavior and predict churn in
advance. Additionally, DevOps practices like continuous integration and
monitoring will help keep the system updated and running efficiently. By
taking timely actions based on these predictions, businesses can improve
customer retention and reduce losses.
● Objective
• Improve Accuracy – Overcome the limitations of traditional
methods with a data-driven approach.
• Predict Customer Churn – Use machine learning to identify
customers who are likely to stop using a service.
• Enable Timely Actions – Provide early warnings so businesses
can take steps to retain customers.
• Ensure Scalability – Develop a system that works efficiently for
businesses of any size.
• Integrate DevOps – Use continuous integration and monitoring
to keep the system updated and effective.
● Scope
Key Functionalities
1. Data Analysis & Machine Learning
Perform Exploratory Data Analysis (EDA) to identify churn patterns.
Build a predictive model using XGBoost to classify customers as "churn"
or "non-churn."
Optimize model performance using feature engineering and
hyperparameter tuning.
2. DevOps Integration
CI/CD Pipeline: Automate development, testing, and deployment using
GitHub Actions.
Containerization: Use Docker to ensure consistency across different
environments.
Infrastructure as Code (IaC): Deploy infrastructure using Terraform to
automate AWS resource provisioning.
3. Cloud Deployment & Scalability
Deploy the model as a REST API using FastAPI or Flask on AWS EC2 or a
local server.
4. Monitoring & Logging
Implement Prometheus and Grafana for real-time monitoring of API
performance.
Set up logging and alerts for tracking errors and system failures.
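To make the EDA step above concrete, the sketch below computes churn rate per category for one column. The column names (Contract, Churn) and the toy rows are assumptions modelled on common churn datasets, not the project's actual schema.

```python
from collections import Counter

# Toy rows standing in for the real dataset; "Contract" and "Churn"
# are hypothetical column names (the project's schema may differ).
rows = [
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Churn": "No"},
    {"Contract": "Two year",       "Churn": "No"},
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "Two year",       "Churn": "No"},
]

def churn_rate_by(rows, column):
    """Return {category: fraction of churned customers} for one column."""
    totals, churned = Counter(), Counter()
    for r in rows:
        totals[r[column]] += 1
        if r["Churn"] == "Yes":
            churned[r[column]] += 1
    return {k: churned[k] / totals[k] for k in totals}

rates = churn_rate_by(rows, "Contract")
print(rates)  # month-to-month customers churn more often in this toy sample
```

The same grouping idea, applied per feature over the full dataset (typically with pandas), is what surfaces the churn patterns the EDA step looks for.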
2. Project Progress
Task | Planned Completion | Actual Completion | Status (Completed/In Progress/Pending)
Feature Implementation | 100% | 80% | In Progress
CI/CD Pipeline Integration | 100% | 100% | Completed
Infrastructure Setup | 100% | 100% | Completed
Security Implementation | 100% | 100% | Completed
Monitoring & Logging | 100% | 90% | In Progress
3. DevOps Implementation Details
3.1 Version Control & Collaboration
● Repository Link: https://github.com/ts7000/Customer-Churn-Analysis
● Branching Strategy:
• Main Branch (main) → Production-ready, stable code.
• Development Branch (develop) → Ongoing development happens
here.
• Feature Branches (feature/*) → Used for specific improvements:
• feature/improve-ml-model → ML model tuning
• feature/api-enhancement → Flask API updates
• feature/devops-integration → Adding Terraform, Airflow
• feature/ui-improvements → Improving the front-end
● Pull Requests & Merge Strategy
Merging Rules:
• Feature branches → develop (via Squash & Merge) to keep commit
history clean.
• develop → main (via Merge Commit) only after passing all tests.
• Hotfixes should be merged into both main and develop to keep things
in sync.
Pull Request Process:
1. Open a PR from feature/* to develop.
2. Code Review – Check model accuracy, API security, Terraform
scripts.
3. Automated Tests (GitHub Actions, Jenkins, or Airflow DAG
runs):
• ML Model Evaluation (accuracy, precision, recall).
• Security Scans (SonarQube, OWASP ZAP).
• Infrastructure Validation (Terraform Plan).
4. If everything passes, merge into develop.
5. When stable, merge develop → main → Deploy to production.
3.2 CI/CD Pipeline Implementation
● CI/CD Tool Used
• GitHub Actions (for automating build, test, and deployment)
• Render (for deployment of the Flask app)
● Pipeline Workflow
The pipeline follows these stages:
1. Build Stage:
• Install the project's Python version (3.11).
• Install dependencies from requirements.txt.
• Use a Python virtual environment (venv) to isolate dependencies
and avoid conflicts.
2. Test Stage:
• Run Unit Tests (using pytest)
• Perform Integration Tests to validate API responses
• Check for vulnerabilities using Gitleaks (secrets scanning) and
Syft (SBOM generation)
3. Deploy Stage:
• Push changes to GitHub
• Render automatically deploys the updated Flask application
• Post-deployment, Grafana and Prometheus monitor
application performance
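The stages above map onto a GitHub Actions workflow. The actual render-deploy.yml is in the repository; the fragment below is only an illustrative sketch, not the project's exact file (job names, the Python version pin, and the Render deploy-hook secret name are assumptions).

```yaml
name: CI
on:
  push:
    branches: [main]
jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: pytest                         # unit + integration tests
      - uses: gitleaks/gitleaks-action@v2   # secrets scanning
  deploy:
    needs: build-test
    runs-on: ubuntu-latest
    steps:
      # Render can also auto-deploy on push; a deploy hook is one explicit option.
      - run: curl -fsSL "$RENDER_DEPLOY_HOOK"
        env:
          RENDER_DEPLOY_HOOK: ${{ secrets.RENDER_DEPLOY_HOOK }}
```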
● Automated Tests:
• Unit Tests: Validate individual functions in app.py
• Integration Tests: Test API endpoints with sample JSON requests
• Security Scans:
• Gitleaks (checks for exposed secrets)
• Syft (generates an SBOM)
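A sketch of what such tests might look like. The predict helper and its input schema are hypothetical stand-ins for the real functions in app.py; actual tests would import from the application and run under pytest.

```python
# Hypothetical stand-in for a prediction helper in app.py; the real
# function, its input schema, and the model behind it all differ.
def predict(features: dict) -> dict:
    tenure = features["tenure"]
    monthly = features["monthly_charges"]
    # Toy rule standing in for the trained XGBoost model.
    churn = tenure < 6 and monthly > 70
    return {"churn": churn}

# pytest-style unit tests (pytest collects plain assert-based functions).
def test_short_tenure_high_charges_flags_churn():
    assert predict({"tenure": 2, "monthly_charges": 95.0}) == {"churn": True}

def test_long_tenure_customer_is_retained():
    assert predict({"tenure": 48, "monthly_charges": 40.0}) == {"churn": False}

if __name__ == "__main__":  # direct run when pytest is unavailable
    test_short_tenure_high_charges_flags_churn()
    test_long_tenure_customer_is_retained()
    print("all tests passed")
```

Integration tests follow the same pattern but post sample JSON to the running API (e.g. with Flask's test client) and assert on the response body and status code.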
3.3 Infrastructure as Code (IaC)
● Tools Used: Terraform, Docker
● Deployment Environment: Render (cloud PaaS)
● Infrastructure Configuration
• Infrastructure Setup:
• Application containerized using Docker
• Infrastructure managed using Terraform
• Environment variables configured securely in Render
• Auto-scaling enabled for handling high traffic
• Monitoring & Security:
• Logs and metrics collected using Prometheus and Grafana
• Security scans integrated with Trivy and Gitleaks
• Role-based access control for API authentication
3.4 Monitoring & Logging
● Monitoring Tools:
• Prometheus – Collects and stores metrics, monitors system health, and
provides alerts.
• Grafana – Visualizes Prometheus metrics, creating dashboards for
application and infrastructure monitoring.
Logging Setup:
1. Application Logs
• Flask automatically generates logs for API requests, errors, and
warnings.
• We can enhance this by using Python’s logging module to log
prediction requests and responses.
2. Centralized Logging (Recommended for Production)
• When deployed on Render, logs are accessible through Render’s
built-in logging UI.
3. Error Logging & Debugging
• Flask’s built-in error logs help debug issues when predictions fail.
4. Metrics-Based Logging
• Prometheus collects metrics, but it must be configured to log
model response times, request counts, etc.
• Example: Log when a model prediction request is made and
track response time.
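The metrics-based logging idea above can be sketched with Python's standard logging module plus a timing wrapper; predict_churn below is a hypothetical stand-in for the real model call, and the official prometheus_client library would expose the same numbers as Prometheus metrics.

```python
import logging
import time

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("churn-api")

def predict_churn(features: dict) -> float:
    """Hypothetical stand-in for the real XGBoost model call."""
    time.sleep(0.01)  # simulate inference latency
    return 0.42

def timed_predict(features: dict) -> float:
    """Log each prediction request and its response time."""
    start = time.perf_counter()
    score = predict_churn(features)
    elapsed_ms = (time.perf_counter() - start) * 1000
    log.info("prediction score=%.2f response_time_ms=%.1f", score, elapsed_ms)
    return score

score = timed_predict({"tenure": 12})
```

Each API call then leaves one structured log line with the score and latency, which is exactly the signal Grafana dashboards and alert rules are built on.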
3.5 Security & DevSecOps
● Security Tools Used: Gitleaks, Syft

Check | Tool | Purpose
Secret Scanning | Gitleaks | Prevents exposure of secrets.
Dependency Scanning | Syft & Grype | Detects vulnerabilities in requirements.txt.
Firewall Protection | macOS Firewall | Blocks unauthorized connections.
● Compliance Checks
1. Secure Coding Practices
• Use proper input validation and sanitization for API requests.
2. Authentication & Authorization
• Use JWT-based authentication to secure API endpoints.
• Restrict access to model inference API to prevent unauthorized
use.
3. Dependency Security
• Use Trivy to scan for vulnerabilities in dependencies (Flask,
XGBoost, etc.).
• Regularly update your requirements.txt to patch security issues.
4. API Security
• Ensure HTTPS is enforced when making API requests.
• Implement rate limiting to prevent abuse of the prediction API.
5. Logging & Monitoring for Security Events
• Prometheus & Grafana monitor API traffic and system
performance.
• Set up alerts for high error rates or unauthorized access
attempts.
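As a concrete illustration of the rate-limiting point above, a minimal in-process token-bucket sketch in plain Python; a production setup would more likely use a library such as Flask-Limiter or a reverse proxy, and the capacity/refill numbers here are arbitrary.

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: allow() returns False once the
    burst capacity is spent faster than tokens refill."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

limiter = TokenBucket(capacity=3, refill_per_sec=1.0)
results = [limiter.allow() for _ in range(5)]
print(results)  # first burst passes, then requests are rejected
```

Wrapping the prediction endpoint so it returns HTTP 429 when allow() is False (keyed per client) is the usual way this plugs into a Flask API.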
4. Challenges & Solutions
Challenge Faced | Solution Implemented
Data Imbalance – The dataset had significantly fewer churned customers than non-churned customers, affecting model accuracy. | Used SMOTE (Synthetic Minority Over-sampling Technique) to balance the dataset and improve prediction reliability.
Model Deployment Issues – Compatibility problems with XGBoost versioning during deployment. | Downgraded XGBoost to a stable version and ensured consistent dependency management using a virtual environment.
Security & Monitoring – Lack of logging and monitoring to track model performance and API usage. | Integrated Prometheus & Grafana for real-time monitoring and set up alerts for high error rates or unauthorized access attempts.
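The SMOTE idea from the table above (synthesizing minority-class points by interpolating between existing ones) can be sketched in a few lines of plain Python. The real project would use imbalanced-learn's SMOTE; this simplified version interpolates between random minority pairs instead of true k-nearest neighbours.

```python
import random

def naive_smote(minority, n_new, seed=0):
    """Generate n_new synthetic minority samples by linear interpolation
    between random pairs of existing minority samples (simplified SMOTE:
    real SMOTE interpolates toward k-nearest neighbours)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(x + u * (y - x) for x, y in zip(a, b)))
    return synthetic

# Toy imbalanced data: 2 churned vs 6 non-churned customers,
# each sample a (tenure, monthly_charges) pair.
churned = [(1.0, 70.0), (3.0, 90.0)]
new_samples = naive_smote(churned, n_new=4)
balanced_minority = churned + new_samples
print(len(balanced_minority))  # 6, matching the majority class
```

Because the synthetic points lie on segments between real churned customers, they stay inside the minority-class region rather than duplicating rows, which is what makes SMOTE preferable to plain oversampling.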
5. Next Steps & Pending Tasks
● [Task 1] – Expected Completion: 10/04/2025
● [Task 2] – Expected Completion: 10/04/2025
● [Task 3] – Expected Completion: 10/04/2025
6. Conclusion & Learnings
● Key Takeaways
• Importance of Data Preprocessing – Handling missing values, feature
scaling, and balancing datasets significantly improves model
performance.
• Model Selection Matters – Comparing multiple models (Logistic
Regression, Decision Tree, Random Forest, XGBoost, etc.) helped in
selecting the best-performing one.
• DevOps & CI/CD Implementation – Automating deployment pipelines
using Jenkins/GitHub Actions improved efficiency and reduced manual
effort.
• Security is Crucial – Check for vulnerabilities using Gitleaks (for secrets
scanning) and Syft (for SBOM generation).
• Monitoring Improves Reliability – Using Prometheus and Grafana for
real-time monitoring ensured the model performed well post-
deployment.
● Improvements Needed:
• Feature Engineering Optimization – Exploring advanced feature
selection methods (PCA, LASSO regression) to enhance model accuracy.
• Model Explainability – Implementing SHAP values or LIME for better
interpretability of predictions.
• Enhanced Logging Mechanism – Centralizing logs using ELK stack for
better debugging and tracking of model behavior.
7. References & Documentation Links
● GitHub Repository: https://github.com/ts7000/Customer-Churn-Analysis
● CI/CD Pipeline Configuration: https://github.com/ts7000/Customer-Churn-Analysis/blob/main/backend/.github/workflows/render-deploy.yml
● Infrastructure Setup: https://github.com/ts7000/Customer-Churn-Analysis/blob/main/backend/Terraform/main.tf
● Monitoring Dashboard: http://localhost:3000/
PBL-II (Project-Based Learning - II) out of 20 marks:
Component | Description | Marks Allocated
8 Practices | Each practice carries 1 mark (8 × 1 = 8 marks) based on completion and accuracy. | 8 Marks
1 Virtual Lab Experiment | Evaluated based on execution, accuracy, and understanding. | 2 Marks
Review 2 (70% Project Completion) | Based on project progress, documentation, DevOps implementation, and presentation. | 10 Marks
Review – 2 Schedule
Batch – 1 Schedule
S. No | Batch | Day Order
1 | 1-5 | 1
2 | 6-10 | 1
3 | 11-15 | 3
4 | 16-20 | 1
5 | 21-25 | 1
6 | 26-30 | 3
7 | 31, 32 | 1
Batch – 2 Schedule
S. No | Batch | Day Order
1 | 1-5 | 1
2 | 6-10 | 1
3 | 11-15 | 3
4 | 16-20 | 1
5 | 21-25 | 1
6 | 26-30 | 3
7 | 31, 32 | 1
Note: Review 2 Starts from March 18th 2025