This project aims to predict house prices using a machine learning model. The project involves data cleaning, feature engineering, model selection, training, and evaluation. The dataset is uploaded by the user, and the model is trained to predict house prices based on various features.

Rayyan9477/House-Price-Prediction-Model

House Price Prediction CI/CD Pipeline

Introduction

This project is a machine learning web application that predicts house prices with a scikit-learn RandomForestRegressor model, served through a Flask REST API. A CI/CD pipeline built on GitHub Actions runs code quality checks and automated tests, then builds the Docker image and pushes it to Docker Hub.

CI/CD Pipeline Overview

Pipeline Architecture

The CI/CD pipeline follows a three-branch strategy with automated workflows:

┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│     dev     │───▶│    test     │───▶│    main     │───▶│ Docker Hub  │
│             │    │             │    │             │    │             │
│ Code Quality│    │ Unit Testing│    │ Deployment  │    │ Container   │
│ Check       │    │ Coverage    │    │ Email Alert │    │ Registry    │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘

Branch Strategy

1. Development Branch (dev)

  • Purpose: Feature development and initial code validation
  • Triggers:
    • Code quality checks with flake8
    • Security scanning with bandit
    • PEP 8 compliance verification
  • Protection: Requires admin approval for merges

2. Test Branch (test)

  • Purpose: Comprehensive testing and validation
  • Triggers:
    • Automated unit tests
    • Integration tests
    • Code coverage analysis
  • Protection: Requires successful test completion

3. Main Branch (main)

  • Purpose: Production-ready code
  • Triggers:
    • Docker image build and push to Docker Hub
    • Email notifications to administrators
    • Security scanning of container images

Workflows

1. Code Quality Workflow (.github/workflows/code-quality.yml)

Trigger: Push to dev branch or PR to dev

Features:

  • Python syntax validation
  • flake8 linting with PEP 8 compliance
  • Security vulnerability scanning with bandit
  • Code complexity analysis

2. Testing Workflow (.github/workflows/testing.yml)

Trigger: Push to test branch or PR to test

Features:

  • Comprehensive unit test execution
  • Code coverage reporting
  • API endpoint validation
  • Flask application startup testing

3. Deployment Workflow (.github/workflows/deploy.yml)

Trigger: Push to main branch or merged PR to main

Features:

  • Docker image building and optimization
  • Multi-tag versioning (latest, branch, SHA)
  • Push to Docker Hub registry
  • Container security scanning
  • Email notifications to administrators
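
The multi-tag versioning listed above (latest, branch, SHA) can be illustrated with a short Python sketch. This is not the content of deploy.yml; the function name `image_tags`, the `sha-` prefix, the 7-character short SHA, and the repository name in the usage example are all assumptions made for illustration.

```python
def image_tags(repository: str, branch: str, commit_sha: str) -> list[str]:
    """Return the three tags a build of `branch` at `commit_sha` would carry."""
    # Docker tags may not contain "/", so a branch such as "feature/x" is flattened.
    safe_branch = branch.replace("/", "-")
    short_sha = commit_sha[:7]  # assumed short-SHA length
    return [
        f"{repository}:latest",
        f"{repository}:{safe_branch}",
        f"{repository}:sha-{short_sha}",  # the "sha-" prefix is an assumption
    ]


if __name__ == "__main__":
    # Hypothetical repository name and commit SHA, for illustration only.
    for tag in image_tags("rayyan9477/house-price-prediction", "main", "0123456789abcdef"):
        print(tag)
```

Tagging every image with its commit SHA keeps each build traceable even after `latest` moves on to a newer image.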

API Documentation

The Flask application provides the following REST API endpoints:

Base URL: http://localhost:5000

1. Health Check

  • Endpoint: GET /health
  • Description: Check API status and model availability
  • Response:
{
  "status": "healthy",
  "model_loaded": true
}
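
A client can use this endpoint to wait until the service is usable, for example right after starting the container. The sketch below uses only the Python standard library and the response fields shown above. The helper names (`fetch_health`, `is_ready`, `wait_until_ready`) and the retry settings are assumptions, not part of the project.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_health(base_url: str = "http://localhost:5000") -> dict:
    """GET /health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=5) as response:
        return json.load(response)


def is_ready(health: dict) -> bool:
    """True only when the API reports healthy and the model is loaded."""
    return health.get("status") == "healthy" and health.get("model_loaded") is True


def wait_until_ready(fetch: Callable[[], dict] = fetch_health,
                     attempts: int = 10, delay: float = 1.0) -> bool:
    """Poll until ready; connection errors count as 'not ready yet'."""
    for _ in range(attempts):
        try:
            if is_ready(fetch()):
                return True
        except OSError:  # urllib.error.URLError is a subclass of OSError
            pass
        time.sleep(delay)
    return False
```

Passing the fetch function as a parameter makes the polling logic testable without a running server.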

2. Model Information

  • Endpoint: GET /model/info
  • Description: Get trained model details
  • Response:
{
  "model_type": "RandomForestRegressor",
  "features_count": 12,
  "status": "trained"
}

3. Feature Information

  • Endpoint: GET /features
  • Description: Get list of required input features
  • Response:
{
  "features": ["area", "bedrooms", "bathrooms", ...],
  "numerical": ["area", "bedrooms", "bathrooms", ...],
  "categorical": ["mainroad", "guestroom", ...],
  "total_features": 12
}
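
Before calling the prediction endpoint, a client may want to check its payload against the feature list. The sketch below takes the twelve feature names from the example request in the Price Prediction section. The split into numerical and categorical features is inferred from the example values, and `validate_features` is a hypothetical helper; the server's own validation may differ.

```python
# Feature names as they appear in the example /predict request.
# The numerical/categorical split is an assumption based on the example values.
NUMERICAL = ("area", "bedrooms", "bathrooms", "stories", "parking")
CATEGORICAL = ("mainroad", "guestroom", "basement", "hotwaterheating",
               "airconditioning", "prefarea", "furnishingstatus")


def validate_features(features: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks usable."""
    problems = []
    for name in NUMERICAL:
        value = features.get(name)
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: expected a number, got {value!r}")
    for name in CATEGORICAL:
        value = features.get(name)
        if not isinstance(value, str):
            problems.append(f"{name}: expected a string, got {value!r}")
    unknown = sorted(set(features) - set(NUMERICAL) - set(CATEGORICAL))
    problems.extend(f"{name}: unknown feature" for name in unknown)
    return problems
```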

4. Price Prediction

  • Endpoint: POST /predict
  • Description: Predict house price based on features
  • Request Body:
{
  "features": {
    "area": 1500,
    "bedrooms": 3,
    "bathrooms": 2,
    "stories": 2,
    "mainroad": "yes",
    "guestroom": "no",
    "basement": "no",
    "hotwaterheating": "no",
    "airconditioning": "yes",
    "parking": 2,
    "prefarea": "yes",
    "furnishingstatus": "furnished"
  }
}
  • Response:
{
  "prediction": 4500000.0,
  "status": "success"
}
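
A minimal Python client for this endpoint might look like the sketch below. It uses only the standard library and the request and response shapes shown above. The function names and the error handling are assumptions; the API's actual error responses are not documented here.

```python
import json
import urllib.request


def build_predict_request(features: dict,
                          base_url: str = "http://localhost:5000") -> urllib.request.Request:
    """Build (but do not send) the POST /predict request."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features: dict, base_url: str = "http://localhost:5000") -> float:
    """Send the request and return the predicted price."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    # Assumed failure handling: anything other than "success" is treated as an error.
    if payload.get("status") != "success":
        raise RuntimeError(f"prediction failed: {payload}")
    return float(payload["prediction"])
```

With the application running locally, `predict({...})` called with the twelve features from the example request returns the value of the `prediction` field as a float.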

5. Model Retraining

  • Endpoint: POST /retrain
  • Description: Retrain the model with current dataset
  • Response:
{
  "status": "Model retrained successfully",
  "metrics": {
    "mae": 123.45,
    "mse": 456.78,
    "r2": 0.89,
    "r2_percentage": 89.0
  }
}
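
For reference, the metrics in this response follow the standard definitions of mean absolute error, mean squared error, and the coefficient of determination (R²). The plain-Python sketch below shows how they relate. It assumes `r2_percentage` is simply `r2 * 100`, which matches the sample values above (0.89 and 89.0); whether the application computes them this way or through `sklearn.metrics` is not stated here.

```python
def regression_metrics(y_true: list[float], y_pred: list[float]) -> dict:
    """Compute MAE, MSE and R² from their textbook definitions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    # Note: ss_tot is zero when all targets are identical; this sketch does not handle that case.
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "r2": r2, "r2_percentage": r2 * 100}
```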

Installation

Local Development Setup

  1. Clone the repository:

    git clone https://github.com/Rayyan9477/House-Price-Prediction-Model.git
    cd House-Price-Prediction-Model
  2. Switch to development branch:

    git checkout dev
  3. Create virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  4. Install dependencies:

    pip install -r requirements.txt
  5. Run the application:

    python app.py

Testing Setup

  1. Run unit tests:

    pytest tests/ -v
  2. Run tests with coverage:

    pytest tests/ --cov=app --cov-report=html

Docker Deployment

Building Docker Image

docker build -t house-price-prediction .

Running Container

docker run -p 5000:5000 house-price-prediction

Using Docker Compose (Optional)

Create docker-compose.yml:

version: '3.8'
services:
  app:
    build: .
    ports:
      - "5000:5000"
    environment:
      - FLASK_ENV=production

Run with:

docker-compose up

Dependencies

Core Dependencies

  • Flask 2.3.3: Web framework for API development
  • pandas 2.0.3: Data manipulation and analysis
  • numpy 1.24.3: Numerical computing
  • scikit-learn 1.3.0: Machine learning algorithms
  • matplotlib 3.7.2: Data visualization
  • seaborn 0.12.2: Statistical data visualization

Development Dependencies

  • pytest 7.4.0: Testing framework
  • flake8 6.0.0: Code linting and style checking
  • pytest-cov: Code coverage reporting

Contributing

Development Workflow

  1. Fork the repository

  2. Create feature branch from dev:

    git checkout dev
    git checkout -b feature/your-feature-name
  3. Make changes and commit:

    git add .
    git commit -m "feat: add your feature description"
  4. Push changes:

    git push origin feature/your-feature-name
  5. Create Pull Request to dev branch

Pull Request Process

  1. dev → test: Feature completion, triggers testing workflow
  2. test → main: Testing success, triggers deployment workflow
  3. Admin approval required for all merges
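
The promotion path can be summarised in a few lines of Python. This sketch only restates the branch names and workflow files from the Branch Strategy and Workflows sections; the helper names `next_target` and `workflow_after_merge` are hypothetical.

```python
from typing import Optional

PROMOTION_ORDER = ("dev", "test", "main")

# Workflow that runs when code lands on each branch.
WORKFLOW_FOR_BRANCH = {
    "dev": ".github/workflows/code-quality.yml",
    "test": ".github/workflows/testing.yml",
    "main": ".github/workflows/deploy.yml",
}


def next_target(branch: str) -> Optional[str]:
    """Return the branch that `branch` is merged into next, or None for main."""
    index = PROMOTION_ORDER.index(branch)  # raises ValueError for unknown branches
    if index + 1 < len(PROMOTION_ORDER):
        return PROMOTION_ORDER[index + 1]
    return None


def workflow_after_merge(source: str) -> Optional[str]:
    """Return the workflow file triggered by merging `source` into its target."""
    target = next_target(source)
    return WORKFLOW_FOR_BRANCH[target] if target else None
```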

Code Standards

  • Follow PEP 8 style guidelines
  • Maintain code coverage above 80%
  • Add unit tests for new features
  • Update documentation for API changes

Required GitHub Secrets

Configure the following secrets in your GitHub repository:

| Secret Name             | Description             | Example               |
|-------------------------|-------------------------|-----------------------|
| DOCKER_HUB_USERNAME     | Docker Hub username     | rayyan9477            |
| DOCKER_HUB_ACCESS_TOKEN | Docker Hub access token | dckr_pat_...          |
| EMAIL_USERNAME          | SMTP email username     | [email protected]     |
| EMAIL_PASSWORD          | SMTP email app password | app-specific-password |
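
A quick way to catch a missing secret early is to check for it before a deployment step runs. The sketch below assumes the workflow exposes each secret as an environment variable of the same name, which is not confirmed by this README; `missing_secrets` is a hypothetical helper.

```python
import os

REQUIRED_SECRETS = (
    "DOCKER_HUB_USERNAME",
    "DOCKER_HUB_ACCESS_TOKEN",
    "EMAIL_USERNAME",
    "EMAIL_PASSWORD",
)


def missing_secrets(env=os.environ) -> list[str]:
    """Return the names of required secrets that are unset or empty."""
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        raise SystemExit("Missing secrets: " + ", ".join(missing))
    print("All required secrets are set.")
```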

Project Structure

House-Price-Prediction-Model/
├── .github/
│   └── workflows/
│       ├── code-quality.yml
│       ├── testing.yml
│       └── deploy.yml
├── tests/
│   ├── __init__.py
│   └── test_app.py
├── app.py                 # Flask application
├── House_dataset.csv      # Training dataset
├── requirements.txt       # Python dependencies
├── Dockerfile            # Container configuration
├── .dockerignore         # Docker ignore rules
├── Readme.md            # Project documentation
└── LICENSE              # License file

Video Demonstration

Watch the video demonstration of the project:

House Price Prediction Demo

Click the link to watch demo.

Contact

License

This project is licensed under the MIT License - see the LICENSE file for details.
