
A full-stack machine learning and generative AI project implementing modern deep learning techniques including CNNs, GANs, Diffusion Models, Energy-Based Models, and Large Language Models (LLMs). The project features a FastAPI backend with Docker containerization and Jupyter notebook support for interactive development.

hyper07/gen-ai

Gen AI Training Models and API

A comprehensive Generative AI platform that provides model training scripts and a RESTful API for:

  • Language Models (GPT-2 fine-tuning, RL-based formatting)
  • Computer Vision (CNN, GAN, Diffusion, Energy-based models)
  • Natural Language Processing (Bigram models, embeddings, text generation)

Quick Start

🍎 Mac Users (Recommended - Native MPS Support)

For Mac users, we recommend using command-line training scripts directly to take advantage of Apple Silicon (M1/M2/M3) GPU acceleration via MPS (Metal Performance Shaders). Docker does not support native MPS acceleration.

  1. Setup Python Environment:

    # Create and activate virtual environment
    python -m venv .venv
    source .venv/bin/activate
    
    # Install dependencies
    cd genai
    pip install -e .
    cd ..  # back to the repository root; the training commands are run from there
  2. Train Models Using Commands (see training sections below):

    # Example: Train CNN on CIFAR-10 (uses MPS automatically)
    python genai/commands/train_cnn.py --epochs 10 --batch_size 64 --dataset cifar10
    
    # Example: Fine-tune GPT-2 (uses MPS automatically)
    python genai/commands/fine_tune_gpt2.py --epochs 3 --batch_size 8

Note: All training scripts automatically detect and use MPS on Mac, then fall back to CUDA or CPU if unavailable.

🐳 Docker Users (Linux/Windows)

For Linux/Windows users or when you need the API server:

# Start all services
docker-compose up --build -d

# Access the services:
# FastAPI: http://localhost:8888
# FastAPI Docs: http://localhost:8888/docs
# Jupyter: http://localhost:8889

Important Docker Limitations:

  • ⚠️ MPS (Apple Silicon GPU) is NOT supported in Docker - Docker containers cannot access Mac's native GPU
  • ✅ CUDA works fine in Docker on Linux/Windows systems with NVIDIA GPUs
  • ⚠️ CPU training is very slow for LLM and CNN models - not recommended for production training
  • 💡 Mac users should use native command-line training for best performance

📦 Automatic Downloads

Note: This repository does not include large dataset files or pre-trained model checkpoints. All datasets (CIFAR-10, MNIST, etc.) and model weights will be automatically downloaded when you run the training scripts or use the API endpoints. The first run may take longer as datasets are downloaded.

  • Datasets: Automatically downloaded from official sources (PyTorch datasets) on first use
  • Pre-trained Models: Download from HuggingFace or train from scratch using the provided scripts
  • Model Checkpoints: Created automatically during training and saved to genai/checkpoints/
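To see what training has produced so far, a small helper can walk the checkpoint directory. This is a minimal sketch rather than part of the repository: `list_checkpoints` is a hypothetical name, and it assumes checkpoints use the `.pth` or `.pt` extensions that appear elsewhere in this README.

```python
from pathlib import Path


def list_checkpoints(root: str = "genai/checkpoints") -> list[str]:
    """Return checkpoint files under root as sorted, root-relative paths.

    Hypothetical helper: assumes checkpoints end in .pth or .pt.
    """
    base = Path(root)
    if not base.is_dir():
        return []  # nothing has been trained (or downloaded) yet
    found = [
        p.relative_to(base).as_posix()
        for p in base.rglob("*")
        if p.is_file() and p.suffix in {".pth", ".pt"}
    ]
    return sorted(found)


if __name__ == "__main__":
    for name in list_checkpoints():
        print(name)
```

An empty result on a fresh clone is expected, since checkpoints only appear after a training run.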

Project Structure

This repository contains the following structure:

gen-ai/
├── .gitignore                          # Git ignore file
├── .python-version                     # Python version specification
├── .venv/                              # Virtual environment directory
├── README.md                           # Project documentation
├── docker-compose.yml                  # Docker Compose configuration for multi-service setup
├── pytest.ini                          # Pytest configuration file
├── start_jupyter.bat                   # Windows Jupyter startup script
├── start_jupyter.py                    # Python Jupyter startup script
├── start_jupyter.sh                    # Unix/Linux Jupyter startup script
├── jupyter/                            # Jupyter Notebook service
│   ├── Dockerfile                      # Docker container configuration for Jupyter
│   ├── README.md                       # Jupyter service documentation
│   ├── requirements.txt                # Python dependencies for Jupyter notebooks
│   └── workspace/                      # Jupyter workspace directory
└── genai/                              # Main application directory
    ├── .dockerignore                   # Docker ignore file
    ├── .env.example                    # Environment variables template
    ├── API_DOCUMENTATION.md            # API documentation
    ├── Dockerfile                      # Docker container configuration
    ├── README.md                       # Application-specific documentation
    ├── bigram_model.py                 # Bigram language model implementation
    ├── main.py                         # FastAPI application entry point (router-based)
    ├── pyproject.toml                  # Python project configuration and dependencies
    ├── start.sh                        # Application startup script
    ├── commands/                       # Command-line training scripts
    │   ├── train_cnn.py                # Script to train CNN models
    │   ├── train_gan.py                # Script to train GAN models
    │   ├── train_diffusion.py          # Script to train Diffusion model (CIFAR-10)
    │   ├── train_energy.py             # Script to train Energy-Based model (CIFAR-10)
    │   ├── train_llm_finetune.py       # Script to fine-tune LLM on custom datasets
    │   ├── train_llm_format.py         # Script to post-train LLM with RL for formatting
    │   ├── train_simple_lm.py          # Script to train simple GPT-style language model
    │   └── fine_tune_gpt2.py           # Fine-tune GPT-2 on SQuAD QA pairs
    ├── checkpoints/                    # Model checkpoints and saved models
    │   ├── cnn/                        # CNN model checkpoints
    │   ├── gan_{dataset}/              # GAN model checkpoints by dataset
    │   ├── diffusion_cifar/            # Diffusion model checkpoints (latest + final)
    │   ├── energy_cifar/               # Energy model checkpoints (latest + final)
    │   ├── llm_finetuned/              # Fine-tuned LLM checkpoints
    │   ├── llm_rl/                     # RL-trained language model checkpoints
    │   └── rl_gpt_model.pth            # RL-fine-tuned GPT model
    ├── results/                        # Generated sample images
    │   ├── diffusion_cifar/            # Diffusion outputs (grid + individual + originals)
    │   └── energy_cifar/               # Energy outputs (grid + individual + originals)
    ├── result/                         # Result folder for output from functions
    ├── help_lib/                       # Core helper modules (replaces legacy functions/)
    │   ├── __init__.py                 # Package initialization
    │   ├── checkpoints.py              # Model checkpoint utilities
    │   ├── cifar10_classifier.py       # CIFAR-10 training/prediction orchestration
    │   ├── data_loader.py              # Dataset and dataloader utilities (CIFAR-10)
    │   ├── embeddings.py               # Text embedding helpers
    │   ├── evaluator.py                # Training/evaluation loops
    │   ├── generator.py                # Text generation helpers
    │   ├── model.py                    # Model factory and optimizer setup
    │   ├── neural_networks.py          # Simple/Enhanced/Assignment CNNs and utils
    │   ├── probability.py              # Probability utilities
    │   ├── text_processing.py          # Text preprocessing helpers
    │   ├── trainer.py                  # Generic train/eval history collection
    │   └── utils.py                    # Shared utilities
    ├── models/                         # API data models and CNN model factory
    │   ├── __init__.py                 # Package initialization
    │   ├── requests.py                 # Request data models
    │   ├── responses.py                # Response data models
    │   ├── cnn_models.py               # Practical CNN architectures and factory
    │   └── energy_diffusion_models.py  # Diffusion UNet + Energy model and trainers
    └── routers/                        # FastAPI routers for organized API endpoints
        ├── __init__.py                 # Package initialization
        ├── probability.py              # Probability and statistics API endpoints
        ├── embedding.py                # Text processing and embedding API endpoints
        ├── neural_networks.py          # Neural networks, CIFAR-10 & GAN API endpoints
        └── llm.py                      # Language model training and generation API endpoints


Key Files Description

  • docker-compose.yml: Orchestrates multiple services including the FastAPI application and Jupyter notebook server
  • pytest.ini: Configuration file for pytest testing framework
  • .python-version: Specifies the Python version for the project
  • .venv/: Virtual environment directory for local development
  • start_jupyter.*: Cross-platform scripts for starting Jupyter notebook server
  • tests/: Directory for unit tests (pytest compatible)
  • jupyter/: Jupyter notebook service containing:
    • Dockerfile: Container configuration for Jupyter Lab/Notebook environment
    • requirements.txt: Python dependencies for data science and machine learning
    • workspace/: Mounted workspace directory
    • README.md: Jupyter service specific documentation
  • genai/main.py: Main FastAPI application entry point with router-based architecture
  • genai/bigram_model.py: Implementation of a bigram language model for text generation
  • genai/pyproject.toml: Modern Python project configuration with dependencies and build settings
  • genai/Dockerfile: Container configuration for building the FastAPI application image
  • genai/start.sh: Shell script for starting the application
  • genai/.env.example: Template for environment variables configuration
  • genai/API_DOCUMENTATION.md: Comprehensive API documentation
  • genai/test_routers.py: Test script for router-based API endpoints
  • genai/commands/: Command-line training scripts:
    • train_cnn.py: Script to train CNN models on CIFAR-10
    • train_gan.py: Script to train GAN models for image generation
    • train_diffusion.py: Train a UNet-based diffusion model on CIFAR-10
    • train_energy.py: Train an energy-based model on CIFAR-10
    • train_simple_lm.py: Train a simple GPT-style word-level language model (Transformer)
  • genai/checkpoints/: Model checkpoints and saved models:
    • cnn_{dataset}/: CNN model checkpoints by dataset
    • gan_{dataset}/: GAN model checkpoints by dataset
    • diffusion_cifar/: Diffusion model (latest_checkpoint.pth, diffusion_cifar.pth)
    • energy_cifar/: Energy model (latest_checkpoint.pth, energy_cifar.pth)
  • genai/help_lib/: Core helper modules for NLP and vision:
    • data_loader.py: CIFAR-10 transforms and dataloaders
    • neural_networks.py: Includes SimpleCNN, EnhancedCNN, and custom CNN architectures
    • model.py: Model factory for CNN architectures
    • trainer.py, evaluator.py: Training loops and metrics history
    • embeddings.py, text_processing.py, probability.py, utils.py
  • genai/models/: Data models and CNN factory for API:
    • requests.py: API request data structures
    • responses.py: API response data structures
    • cnn_models.py: Practical CNNs (Simple, Enhanced, Flexible) and factory
    • energy_diffusion_models.py: Diffusion UNet, Energy model, trainers, and dataloaders
    • simple_lm.py: GPT-style word-level language model (Transformer)
    • rl_gpt_model.py: RL environment and policy for GPT formatting
  • genai/routers/: FastAPI routers for organized API endpoints:
    • probability.py: Probability and statistics API endpoints (/probability/*)
    • embedding.py: Text processing and embedding API endpoints (/embedding/*)
    • neural_networks.py: Neural networks, CIFAR-10 & GAN API endpoints (/neural-networks/*)
    • llm.py: Language model training and generation API endpoints (/llm/*)

Training Models

🖼️ Computer Vision Models

Train CNN on CIFAR-10

# Mac (uses MPS automatically)
python genai/commands/train_cnn.py \
  --model_type simple \
  --dataset cifar10 \
  --epochs 10 \
  --batch_size 64 \
  --lr 0.001

# Specify device manually (optional)
python genai/commands/train_cnn.py --epochs 10 --device mps  # Mac
python genai/commands/train_cnn.py --epochs 10 --device cuda  # Linux/Windows with NVIDIA GPU

Outputs:

  • Checkpoints: genai/checkpoints/cnn_cifar10/
  • Model file: genai/checkpoints/cnn_cifar10/cnn_cifar10.pth

Train GAN on MNIST

python genai/commands/train_gan.py \
  --dataset mnist \
  --epochs 20 \
  --batch_size 64 \
  --device mps  # Auto-detected on Mac

Outputs:

  • Checkpoints: genai/checkpoints/gan_mnist/
  • Generated samples saved during training

Train Diffusion Model (CIFAR-10)

python genai/commands/train_diffusion.py \
  --epochs 50 \
  --batch_size 128 \
  --lr 0.001 \
  --diffusion_steps 200

Outputs:

  • Checkpoints: genai/checkpoints/diffusion_cifar/
  • Samples: genai/results/diffusion_cifar/ (grid: generated_samples.png, individuals in individual/, and originals in original/ + original_128/)

Note: Increase --diffusion_steps (e.g., 500–1000) for higher-quality samples.
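To get a feel for what the step count changes, the sketch below computes a noise schedule for two values of `--diffusion_steps`. The repository's actual schedule is not shown in this README, so this assumes a DDPM-style linear schedule with `beta` running from 1e-4 to 0.02; the function names are illustrative only.

```python
import math


def linear_beta_schedule(
    steps: int, beta_start: float = 1e-4, beta_end: float = 0.02
) -> list[float]:
    """Linearly spaced noise variances beta_1..beta_T (assumed DDPM-style range)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    step = (beta_end - beta_start) / (steps - 1)
    return [beta_start + i * step for i in range(steps)]


def cumulative_alphas(betas: list[float]) -> list[float]:
    """alpha_bar_t = product of (1 - beta_s) for s <= t: signal fraction left at step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_sample(x0: float, alpha_bar: float, eps: float) -> float:
    """Closed-form forward noising: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    return math.sqrt(alpha_bar) * x0 + math.sqrt(1.0 - alpha_bar) * eps


if __name__ == "__main__":
    for steps in (200, 1000):
        final = cumulative_alphas(linear_beta_schedule(steps))[-1]
        print(f"steps={steps}: signal fraction left at the last step = {final:.5f}")
```

Under these assumed values, a longer schedule leaves less of the original image in the final noised sample and takes smaller steps between neighbouring noise levels, at the cost of more sampling iterations.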

Train Energy Model (CIFAR-10)

python genai/commands/train_energy.py \
  --epochs 50 \
  --batch_size 128 \
  --lr 0.0001

Outputs:

  • Checkpoints: genai/checkpoints/energy_cifar/
  • Samples: genai/results/energy_cifar/ (grid + individual + originals)

Device Auto-Detection:

  • Mac: Automatically uses mps (Metal Performance Shaders) for GPU acceleration
  • Linux/Windows: Uses cuda if NVIDIA GPU available, otherwise cpu
  • CPU training is very slow for CNN and LLM models - use GPU when possible
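The fallback order described above can be sketched as follows. This is an illustration of the preference order (MPS, then CUDA, then CPU), not the repository's own code; `pick_device` and `detect_device` are hypothetical names, and the function degrades to `"cpu"` when PyTorch is not installed.

```python
def pick_device(mps_available: bool, cuda_available: bool) -> str:
    """Apply the preference order: MPS first, then CUDA, then CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends and pick one."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch available, nothing to accelerate
    # torch.backends.mps is missing on older PyTorch releases, hence getattr
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device(mps_ok, torch.cuda.is_available())


if __name__ == "__main__":
    print(f"Training would run on: {detect_device()}")
```

Running this inside a Docker container on a Mac is a quick way to confirm the limitation noted earlier: MPS is not visible there, so the result falls through to `cpu`.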

📝 Language Models

Simple Word-level Language Model (Baseline)

Train a tiny GPT-style Transformer LM from scratch on plain text (word-level). Good for quick experiments and understanding training loops.

python genai/commands/train_simple_lm.py \
  --data_file path/to/text.txt \
  --epochs 5 \
  --seq_len 30 \
  --batch_size 64 \
  --vocab_size 8000 \
  --embedding_dim 256 \
  --hidden_dim 512 \
  --num_layers 1 \
  --dropout 0.1 \
  --learning_rate 3e-4 \
  --output_dir genai/checkpoints/simple_lm

Outputs:

  • Checkpoint: genai/checkpoints/simple_lm/simple_lm.pt
  • Vocabulary: genai/checkpoints/simple_lm/vocab.json

Notes:

  • If --data_file is omitted, a small built‑in sample text is used
  • Preprocessing: lowercase, remove punctuation (keep spaces), whitespace tokenization
  • Device auto-detect: prefers Apple mps, then cuda, else cpu
  • CPU training is very slow - use GPU (MPS/CUDA) when available
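The preprocessing steps and the `--vocab_size` cap can be pictured with the sketch below. It is an assumption-laden illustration, not the training script's code: the special tokens `<PAD>` = 0 and `<UNK>` = 1 and the helper names are hypothetical, and the saved `vocab.json` may be laid out differently.

```python
import re
from collections import Counter


def preprocess(text: str) -> list[str]:
    """Lowercase, replace punctuation with spaces, then split on whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return text.split()


def build_vocab(tokens: list[str], vocab_size: int) -> dict[str, int]:
    """Map the most frequent tokens to ids, capped at vocab_size entries.

    Assumption: ids 0 and 1 are reserved for "<PAD>" and "<UNK>".
    """
    vocab = {"<PAD>": 0, "<UNK>": 1}
    room = max(vocab_size - len(vocab), 0)
    for token, _count in Counter(tokens).most_common(room):
        vocab[token] = len(vocab)
    return vocab


def encode(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Replace each token with its id; unseen tokens become <UNK>."""
    return [vocab.get(token, vocab["<UNK>"]) for token in tokens]


if __name__ == "__main__":
    words = preprocess("The cat sat. The cat ran!")
    vocab = build_vocab(words, vocab_size=4)
    print(vocab)
    print(encode(preprocess("the dog"), vocab))
```

Words that fall outside the cap are mapped to `<UNK>`, which is why a small `--vocab_size` on a large corpus tends to produce generations full of unknown tokens.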

Quick greedy generation example (Python):

import json, torch
from genai.models.simple_lm import SimpleLanguageModel

ckpt_dir = "genai/checkpoints/simple_lm"
with open(f"{ckpt_dir}/vocab.json", "r", encoding="utf-8") as f:
    vocab = json.load(f)
id_to_token = {i: t for t, i in vocab.items()}

# Use the SAME dims you trained with (defaults shown)
model = SimpleLanguageModel(
    vocab_size=len(vocab),
    embedding_dim=256,
    hidden_dim=512,
    num_layers=1,
    dropout=0.1,
)
model.load_state_dict(torch.load(f"{ckpt_dir}/simple_lm.pt", map_location="cpu"))
model.eval()

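def preprocess(text: str):
    import re
    # Replace anything that is not a letter, digit, or whitespace with a space
    text = re.sub(r"[^a-zA-Z0-9\s]", " ", text)
    # Collapse runs of whitespace into a single space
    text = re.sub(r"\s+", " ", text)
    return text.lower().strip().split()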

def encode(tokens): return [vocab.get(t, vocab.get("<UNK>", 1)) for t in tokens]
def decode(ids):    return " ".join(id_to_token.get(i, "<UNK>") for i in ids)

def generate(prompt: str, max_new_tokens: int = 30):
    tokens = encode(preprocess(prompt))
    for _ in range(max_new_tokens):
        inp = torch.tensor([tokens], dtype=torch.long)
        with torch.no_grad():
            logits = model(inp)              # (1, T, V)
            next_id = int(logits[0, -1].argmax().item())
        tokens.append(next_id)
    return decode(tokens)

print(generate("in the beginning"))

GPT-2 Fine-Tuning on SQuAD (Question-Answering)

Fine-tune GPT-2 on the SQuAD dataset for question answering. Recommended for Mac users: the script uses MPS automatically for faster training.

# Mac (uses MPS automatically - recommended)
python genai/commands/fine_tune_gpt2.py \
  --model_name openai-community/gpt2 \
  --dataset_name squad \
  --train_samples 10000 \
  --eval_samples 2000 \
  --epochs 3 \
  --batch_size 8 \
  --output_dir genai/checkpoints/gpt2_squad

Outputs:

  • Checkpoints: genai/checkpoints/gpt2_squad/
  • Best model: genai/checkpoints/gpt2_squad/best/

Performance Notes:

  • ⚡ MPS (Mac): Fast training on Apple Silicon
  • ⚡ CUDA (Linux/Windows): Fast training on NVIDIA GPUs
  • ⚠️ CPU: Very slow - not recommended for LLM training
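Because GPT-2 is a plain text-completion model, each SQuAD record has to be flattened into a single string before fine-tuning. The sketch below shows one way to do that. The record layout (`question`, `answers["text"]`) follows the Hugging Face `squad` dataset, but the "Question: ... Answer: ..." template and the function name are assumptions; the script's real template is not shown in this README.

```python
def format_qa_example(record: dict, eos_token: str = "<|endoftext|>") -> str:
    """Flatten one SQuAD-style record into a single training string.

    Assumed template for illustration only; GPT-2's end-of-text token
    marks where the answer stops.
    """
    answers = record["answers"]["text"]
    answer = answers[0] if answers else ""  # SQuAD may list several answers
    question = record["question"].strip()
    return f"Question: {question}\nAnswer: {answer.strip()}{eos_token}"


if __name__ == "__main__":
    example = {
        "question": "Who wrote Hamlet?",
        "context": "Hamlet is a tragedy written by William Shakespeare.",
        "answers": {"text": ["William Shakespeare"], "answer_start": [31]},
    }
    print(format_qa_example(example))
```

At generation time the same template would be used up to `Answer:` so the model completes the answer and then emits the end-of-text token.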

LLM Fine-Tuning on Custom Dataset

Fine-tune any HuggingFace model on custom datasets:

python genai/commands/train_llm_finetune.py \
  --model_name gpt2 \
  --epochs 3 \
  --batch_size 4 \
  --lr 5e-5 \
  --output_dir genai/checkpoints/llm_finetuned

RL Post-Training With Format Control

Train a model to enforce a specific answer format using reinforcement learning:

python genai/commands/train_llm_format.py \
  --epochs 150 \
  --lr 1e-5 \
  --base_model_path genai/checkpoints/gpt2_squad \
  --target_prefix "That is a great question" \
  --target_suffix "let me know if you have any other questions"

Outputs:

  • Checkpoint: genai/checkpoints/rl_gpt_formatting.pth
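A format-control objective of this kind needs a reward that scores how closely a generated answer matches the target shape. The function below is a hypothetical sketch of such a reward, assuming half credit for the prefix and half for the suffix; the reward actually used by `train_llm_format.py` is not documented here and may differ.

```python
def format_reward(text: str, target_prefix: str, target_suffix: str) -> float:
    """Score in [0, 1]: 0.5 for starting with the prefix, 0.5 for ending with the suffix.

    Hypothetical reward: comparison ignores case, extra whitespace,
    and trailing punctuation.
    """
    normalized = " ".join(text.lower().split()).rstrip(".!? ")
    prefix = " ".join(target_prefix.lower().split())
    suffix = " ".join(target_suffix.lower().split()).rstrip(".!? ")
    reward = 0.0
    if normalized.startswith(prefix):
        reward += 0.5
    if normalized.endswith(suffix):
        reward += 0.5
    return reward


if __name__ == "__main__":
    prefix = "That is a great question"
    suffix = "let me know if you have any other questions"
    answer = (
        "That is a great question. Paris is the capital of France. "
        "Let me know if you have any other questions."
    )
    print(format_reward(answer, prefix, suffix))
```

Giving partial credit for each half is a design choice in this sketch: it gives the policy a learning signal before it manages to produce both parts at once.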

🌐 API Endpoints (Docker/Linux/Windows)

For API-based training and inference, use the Docker setup (see Docker section below). Note that:

  • API endpoints work fine with CUDA on Linux/Windows
  • CPU-based API training is very slow for LLM and CNN models
  • Mac users should prefer command-line training for best performance

API endpoints available:

  • POST /llm/gpt2-finetune – GPT-2 QA fine-tuning
  • POST /llm/gpt2-generate – Generate from fine-tuned GPT-2
  • POST /llm/rl-train – RL post-training
  • POST /llm/rl-generate – Generate using RL-formatted policy
  • See genai/API_DOCUMENTATION.md for full API reference
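Calling one of these endpoints from Python might look like the sketch below, which only builds the request so it can be inspected without a running server. The payload field `prompt` is a placeholder assumption, since the request schemas live in genai/API_DOCUMENTATION.md and are not reproduced in this README; `build_json_post` is likewise a hypothetical helper.

```python
import json
import urllib.request


def build_json_post(base_url: str, path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one of the endpoints above."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical payload: check API_DOCUMENTATION.md for the real field names.
    request = build_json_post(
        "http://localhost:8888", "/llm/gpt2-generate", {"prompt": "What is SQuAD?"}
    )
    print(request.get_method(), request.full_url)
    # Sending the request requires the API server to be running:
    # with urllib.request.urlopen(request, timeout=30) as response:
    #     print(json.load(response))
```

The interactive docs at http://localhost:8888/docs show the exact request body each endpoint expects and are the quickest way to correct the placeholder fields.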

Development Setup

🍎 Mac Setup (Recommended for Training)

  1. Python Environment Setup:

    # Create virtual environment
    python -m venv .venv
    
    # Activate virtual environment
    source .venv/bin/activate
    
    # Install dependencies
    cd genai
    pip install -e .
    cd ..  # back to the repository root; the training commands are run from there
  2. Train Models (see training sections above):

    # All scripts automatically use MPS on Mac
    python genai/commands/train_cnn.py --epochs 10
    python genai/commands/fine_tune_gpt2.py --epochs 3
  3. Start Jupyter Notebook (optional):

    ./start_jupyter.sh
    # Or: python start_jupyter.py

🐧 Linux/Windows Setup

  1. Python Environment Setup:

    # Create virtual environment
    python -m venv .venv
    
    # Activate virtual environment
    # On Linux:
    source .venv/bin/activate
    # On Windows:
    .venv\Scripts\activate
    
    # Install dependencies
    cd genai
    pip install -e .
    cd ..  # back to the repository root; the training commands are run from there
  2. Train Models:

    # The device is auto-detected (CUDA if available, otherwise CPU);
    # --device cuda makes the choice explicit
    python genai/commands/train_cnn.py --epochs 10 --device cuda

Note: CPU training is very slow for LLM and CNN models. Use CUDA (NVIDIA GPU) when available.

Python Version

This project uses the Python version specified in .python-version. Make sure that version is installed before creating the virtual environment.

Docker Setup

Multi-Service Docker Compose

The project uses Docker Compose to orchestrate multiple services:

  • FastAPI Service (genai-fast-api): API server on port 8888
  • Jupyter Service (genai-jupyter): Notebook server on port 8889

# Start all services
docker-compose up --build -d

# Access the services:
# FastAPI: http://localhost:8888
# FastAPI Docs: http://localhost:8888/docs
# Jupyter: http://localhost:8889
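After `docker-compose up`, the containers can take a moment to start. A small standard-library check like the one below can confirm that both ports answer. It is a convenience sketch, not part of the repository, and `service_is_up` is a hypothetical helper name.

```python
import urllib.error
import urllib.request


def service_is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if something answers HTTP at url, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a 2xx status (e.g. a login redirect)
        return True
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, or malformed URL
        return False


if __name__ == "__main__":
    services = {
        "FastAPI docs": "http://localhost:8888/docs",
        "Jupyter": "http://localhost:8889",
    }
    for name, url in services.items():
        status = "up" if service_is_up(url) else "not reachable"
        print(f"{name}: {status} ({url})")
```

If a service stays unreachable, `docker-compose logs` for that service is usually the next place to look.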

⚠️ Important Docker Limitations

For Mac Users:

  • ❌ MPS (Apple Silicon GPU) is NOT supported in Docker
  • Docker containers cannot access Mac's native GPU acceleration
  • Recommendation: Use native command-line training scripts for best performance on Mac

For Linux/Windows Users:

  • ✅ CUDA works fine in Docker on systems with NVIDIA GPUs
  • ⚠️ CPU training is very slow for LLM and CNN models
  • Use --gpus all flag when running Docker containers for GPU access:
    docker run --gpus all -p 8888:8888 genai-fastapi

Performance Comparison:

  • MPS (Mac native): ⚡ Fast
  • CUDA (Linux/Windows): ⚡ Fast
  • CPU (Docker or no GPU): 🐌 Very slow - not recommended for training

Build the FastAPI Docker Image

You can build the FastAPI Docker image from different locations:

Option 1: Build from the project root directory (recommended):

cd /path/to/gen-ai
docker build -t genai-fastapi -f ./genai/Dockerfile ./genai

Option 2: Build from the genai directory:

cd /path/to/gen-ai/genai
docker build -t genai-fastapi .

Option 3: Build from root with different context:

cd /path/to/gen-ai
docker build -t genai-fastapi --build-arg BUILD_CONTEXT=root -f ./genai/Dockerfile .

Run the FastAPI Container

To run the FastAPI container directly:

docker run -p 8888:8888 genai-fastapi

This will start the FastAPI app on port 8888. You can access the API at:

http://localhost:8888

And view the interactive API documentation at:

http://localhost:8888/docs

Jupyter Docker Setup

Build the Jupyter Docker Image

# From project root
docker build -t genai-jupyter -f ./jupyter/Dockerfile ./jupyter

# Or from jupyter directory
cd jupyter
docker build -t genai-jupyter .

Run the Jupyter Container

# Run from the project root; quoting keeps paths with spaces intact
docker run -p 8889:8888 -v "$(pwd)/jupyter/workspace:/home/jovyan" genai-jupyter

This will start the Jupyter server on port 8889. You can access it at:

http://localhost:8889

Stopping the Services

To stop all running containers:

# Using Docker Compose
docker-compose down

# Using Make
make down

If a build fails, check that pyproject.toml is inside the build context you passed to docker build (./genai for Options 1 and 2) and that the paths in the Dockerfile match that context.

