- Introduction
- CI/CD Pipeline Overview
- Branch Strategy
- Workflows
- API Documentation
- Installation
- Usage
- Docker Deployment
- Dependencies
- Contributing
- Contact
This project is a machine learning web application that predicts house prices with a RandomForest regression model. A GitHub Actions CI/CD pipeline checks code quality, runs the automated tests, and publishes the Docker image to Docker Hub.
The CI/CD pipeline follows a three-branch strategy with automated workflows:
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│     dev     │───▶│    test     │───▶│    main     │───▶│ Docker Hub  │
│             │    │             │    │             │    │             │
│ Code Quality│    │ Unit Testing│    │ Deployment  │    │ Container   │
│ Check       │    │ Coverage    │    │ Email Alert │    │ Registry    │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
dev branch:
- Purpose: Feature development and initial code validation
- Triggers:
  - Code quality checks with flake8
  - Security scanning with bandit
  - PEP 8 compliance verification
- Protection: Requires admin approval for merges

test branch:
- Purpose: Comprehensive testing and validation
- Triggers:
  - Automated unit tests
  - Integration tests
  - Code coverage analysis
- Protection: Requires successful test completion

main branch:
- Purpose: Production-ready code
- Triggers:
  - Docker image build and push to Docker Hub
  - Email notifications to administrators
  - Security scanning of container images
Code quality workflow
Trigger: Push to the dev branch or a pull request to dev
Features:
- Python syntax validation
- flake8 linting with PEP 8 compliance
- Security vulnerability scanning with bandit
- Code complexity analysis
Testing workflow
Trigger: Push to the test branch or a pull request to test
Features:
- Comprehensive unit test execution
- Code coverage reporting
- API endpoint validation
- Flask application startup testing
Deployment workflow
Trigger: Push to the main branch or a merged pull request to main
Features:
- Docker image building and optimization
- Multi-tag versioning (latest, branch, SHA)
- Push to Docker Hub registry
- Container security scanning
- Email notifications to administrators
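The multi-tag versioning listed above (latest, branch, SHA) can be illustrated with a small helper. This is only a sketch: the function name `image_tags`, the `sha-` prefix, and the seven-character short SHA are assumptions made for illustration and are not taken from the project's workflow file.

```python
def image_tags(repository: str, branch: str, commit_sha: str) -> list[str]:
    """Return the image references for one build: latest, branch name, short SHA."""
    # Docker tags may not contain "/", so a branch such as "feature/x" is flattened.
    safe_branch = branch.replace("/", "-")
    # Assumption: the commit hash is abbreviated to its first 7 characters.
    short_sha = commit_sha[:7]
    return [
        f"{repository}:latest",
        f"{repository}:{safe_branch}",
        f"{repository}:sha-{short_sha}",
    ]
```

Pushing all three references for the same image lets consumers pin an exact commit while `latest` keeps tracking the newest build of main.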
The Flask application provides the following REST API endpoints:
- Endpoint: GET /health
- Description: Check API status and model availability
- Response:
    {
      "status": "healthy",
      "model_loaded": true
    }
- Endpoint: GET /model/info
- Description: Get trained model details
- Response:
    {
      "model_type": "RandomForestRegressor",
      "features_count": 12,
      "status": "trained"
    }
- Endpoint: GET /features
- Description: Get list of required input features
- Response:
    {
      "features": ["area", "bedrooms", "bathrooms", ...],
      "numerical": ["area", "bedrooms", "bathrooms", ...],
      "categorical": ["mainroad", "guestroom", ...],
      "total_features": 12
    }
- Endpoint: POST /predict
- Description: Predict house price based on features
- Request Body:
    {
      "features": {
        "area": 1500,
        "bedrooms": 3,
        "bathrooms": 2,
        "stories": 2,
        "mainroad": "yes",
        "guestroom": "no",
        "basement": "no",
        "hotwaterheating": "no",
        "airconditioning": "yes",
        "parking": 2,
        "prefarea": "yes",
        "furnishingstatus": "furnished"
      }
    }
- Response:
    {
      "prediction": 4500000.0,
      "status": "success"
    }
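A client for the /predict endpoint might look like the sketch below, which uses only the Python standard library. The feature names are copied from the example request body; the helper names `build_predict_request` and `predict`, the client-side validation, and the base URL passed by the caller are assumptions for illustration and do not describe how the application itself validates input.

```python
import json
import urllib.request

# Feature names copied from the example request body for POST /predict.
REQUIRED_FEATURES = (
    "area", "bedrooms", "bathrooms", "stories", "mainroad", "guestroom",
    "basement", "hotwaterheating", "airconditioning", "parking",
    "prefarea", "furnishingstatus",
)


def build_predict_request(base_url: str, features: dict) -> urllib.request.Request:
    """Check that every feature is present and wrap the dict in a POST request."""
    missing = [name for name in REQUIRED_FEATURES if name not in features]
    if missing:
        raise ValueError(f"missing features: {', '.join(missing)}")
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url: str, features: dict, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the application running locally on port 5000, `predict("http://localhost:5000", features)` would be expected to return a dictionary shaped like the sample response, with `prediction` and `status` keys.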
- Endpoint: POST /retrain
- Description: Retrain the model with current dataset
- Response:
    {
      "status": "Model retrained successfully",
      "metrics": {
        "mae": 123.45,
        "mse": 456.78,
        "r2": 0.89,
        "r2_percentage": 89.0
      }
    }
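For readers unfamiliar with the metrics in the /retrain response, the sketch below computes them from their standard definitions in plain Python. The function name `regression_metrics` is hypothetical; the application presumably relies on scikit-learn for these values, and this is not its actual code.

```python
def regression_metrics(y_true: list[float], y_pred: list[float]) -> dict:
    """Compute MAE, MSE and R^2 from paired actual and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [actual - predicted for actual, predicted in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    mean_true = sum(y_true) / n
    total_ss = sum((actual - mean_true) ** 2 for actual in y_true)
    residual_ss = sum(e * e for e in errors)
    # R^2 is undefined when all actual values are identical (total_ss == 0).
    r2 = 1.0 - residual_ss / total_ss if total_ss else float("nan")
    return {"mae": mae, "mse": mse, "r2": r2, "r2_percentage": r2 * 100.0}
```

As in the sample response, `r2_percentage` is simply `r2` multiplied by 100.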
- Clone the repository:
    git clone https://github.com/Rayyan9477/House-Price-Prediction-Model.git
    cd House-Price-Prediction-Model
- Switch to the development branch:
    git checkout dev
- Create a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
- Run the application:
    python app.py
- Run unit tests:
    pytest tests/ -v
- Run tests with coverage:
    pytest tests/ --cov=app --cov-report=html
Build the image:
    docker build -t house-price-prediction .
Run the container:
    docker run -p 5000:5000 house-price-prediction
Create docker-compose.yml:
    version: '3.8'
    services:
      app:
        build: .
        ports:
          - "5000:5000"
        environment:
          - FLASK_ENV=production
Run with:
    docker-compose up
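Once the container is started, the GET /health endpoint can serve as a readiness check before sending predictions. The following is a minimal sketch using only the standard library; the helper names `fetch_health` and `wait_until_healthy`, the retry count, and the delay are assumptions chosen for illustration.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_health(base_url: str, timeout: float = 2.0) -> dict:
    """GET /health and return the decoded JSON body."""
    url = f"{base_url.rstrip('/')}/health"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def wait_until_healthy(
    fetch: Callable[[], dict],
    attempts: int = 10,
    delay_seconds: float = 1.0,
) -> bool:
    """Poll until the service reports healthy with a loaded model, or give up."""
    for attempt in range(attempts):
        try:
            payload = fetch()
            if payload.get("status") == "healthy" and payload.get("model_loaded") is True:
                return True
        except (OSError, ValueError):
            # OSError covers refused connections and timeouts while the
            # container is still starting; ValueError covers invalid JSON.
            pass
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False


# Example usage, assuming the port mapping 5000:5000 shown above:
# ready = wait_until_healthy(lambda: fetch_health("http://localhost:5000"))
```

Passing the fetch step in as a callable keeps the polling logic independent of the network, which makes it easy to exercise in a unit test with a stub.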
- Flask 2.3.3: Web framework for API development
- pandas 2.0.3: Data manipulation and analysis
- numpy 1.24.3: Numerical computing
- scikit-learn 1.3.0: Machine learning algorithms
- matplotlib 3.7.2: Data visualization
- seaborn 0.12.2: Statistical data visualization
- pytest 7.4.0: Testing framework
- flake8 6.0.0: Code linting and style checking
- pytest-cov: Code coverage reporting
- Fork the repository
- Create a feature branch from dev:
    git checkout dev
    git checkout -b feature/your-feature-name
- Make changes and commit:
    git add .
    git commit -m "feat: add your feature description"
- Push changes:
    git push origin feature/your-feature-name
- Create a Pull Request to the dev branch
- dev → test: Feature completion, triggers testing workflow
- test → main: Testing success, triggers deployment workflow
- Admin approval required for all merges
- Follow PEP 8 style guidelines
- Maintain code coverage above 80%
- Add unit tests for new features
- Update documentation for API changes
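The promotion path described in the contribution steps (feature branch to dev, dev to test, test to the production branch) can be expressed as a small check. This is a hypothetical sketch: the names `PROMOTION_TARGETS`, `expected_target`, and `is_valid_merge` are invented for illustration, and the production branch is assumed to be named main, as in the branch strategy.

```python
from typing import Optional

# Each long-lived branch and the branch it is promoted into.
PROMOTION_TARGETS = {"dev": "test", "test": "main"}


def expected_target(source_branch: str) -> Optional[str]:
    """Return the branch a pull request from source_branch should target."""
    if source_branch.startswith("feature/"):
        return "dev"
    # Unknown branches and main itself have no promotion target.
    return PROMOTION_TARGETS.get(source_branch)


def is_valid_merge(source_branch: str, target_branch: str) -> bool:
    """True when the merge follows feature/* -> dev -> test -> main."""
    return expected_target(source_branch) == target_branch
```

A check like this rejects shortcuts such as merging dev straight into main, which would skip the testing workflow.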
Configure the following secrets in your GitHub repository:
Secret Name | Description | Example |
---|---|---|
DOCKER_HUB_USERNAME | Docker Hub username | rayyan9477 |
DOCKER_HUB_ACCESS_TOKEN | Docker Hub access token | dckr_pat_... |
EMAIL_USERNAME | SMTP email username | [email protected] |
EMAIL_PASSWORD | SMTP email app password | app-specific-password |
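A script run inside a workflow step could verify that these secrets are available before attempting a push or sending mail. The sketch below assumes the workflow exposes each secret to the step as an environment variable of the same name; that mapping, and the helper name `missing_secrets`, are assumptions rather than something the repository is known to do.

```python
import os
from typing import Mapping

# Secret names taken from the table above.
REQUIRED_SECRETS = (
    "DOCKER_HUB_USERNAME",
    "DOCKER_HUB_ACCESS_TOKEN",
    "EMAIL_USERNAME",
    "EMAIL_PASSWORD",
)


def missing_secrets(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required secrets that are unset or blank."""
    return [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]
```

Failing early with the list of missing names gives a clearer error than a rejected Docker Hub login or SMTP authentication later in the job.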
House-Price-Prediction-Model/
├── .github/
│   └── workflows/
│       ├── code-quality.yml
│       ├── testing.yml
│       └── deploy.yml
├── tests/
│   ├── __init__.py
│   └── test_app.py
├── app.py # Flask application
├── House_dataset.csv # Training dataset
├── requirements.txt # Python dependencies
├── Dockerfile # Container configuration
├── .dockerignore # Docker ignore rules
├── Readme.md # Project documentation
└── LICENSE # License file
Watch the video demonstration of the project:
Click the link to watch the demo.
- Email: [email protected]
- GitHub: Rayyan9477
- LinkedIn: Rayyan Ahmed
This project is licensed under the MIT License - see the LICENSE file for details.