A comprehensive generative AI platform that provides model training scripts and a RESTful API for:
- Language Models (GPT-2 fine-tuning, RL-based formatting)
- Computer Vision (CNN, GAN, Diffusion, Energy-based models)
- Natural Language Processing (Bigram models, embeddings, text generation)
For Mac users, we recommend using command-line training scripts directly to take advantage of Apple Silicon (M1/M2/M3) GPU acceleration via MPS (Metal Performance Shaders). Docker does not support native MPS acceleration.
- Set up the Python environment:

    # Create and activate virtual environment
    python -m venv .venv
    source .venv/bin/activate

    # Install dependencies
    cd genai
    pip install -e .

- Train models using the command-line scripts (see training sections below):

    # Example: Train CNN on CIFAR-10 (uses MPS automatically)
    python genai/commands/train_cnn.py --epochs 10 --batch_size 64 --dataset cifar10

    # Example: Fine-tune GPT-2 (uses MPS automatically)
    python genai/commands/fine_tune_gpt2.py --epochs 3 --batch_size 8

Note: All training scripts automatically detect and use MPS on Mac, then fall back to CUDA or CPU if unavailable.
For Linux/Windows users or when you need the API server:
# Start all services
docker-compose up --build -d
# Access the services:
# FastAPI: http://localhost:8888
# FastAPI Docs: http://localhost:8888/docs
# Jupyter: http://localhost:8889

Important Docker Limitations:
- ⚠️ MPS (Apple Silicon GPU) is NOT supported in Docker: containers cannot access the Mac's native GPU
- ✅ CUDA works fine in Docker on Linux/Windows systems with NVIDIA GPUs
- ⚠️ CPU training is very slow for LLM and CNN models and is not recommended for production training
- 💡 Mac users should use native command-line training for best performance

Note: This repository does not include large dataset files or pre-trained model checkpoints. All datasets (CIFAR-10, MNIST, etc.) and model weights will be automatically downloaded when you run the training scripts or use the API endpoints. The first run may take longer as datasets are downloaded.
- Datasets: Automatically downloaded from official sources (PyTorch datasets) on first use
- Pre-trained Models: Download from HuggingFace or train from scratch using the provided scripts
- Model Checkpoints: Created automatically during training and saved to genai/checkpoints/
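
To see which checkpoints a training run has produced so far, something like the following could be used. This is a minimal sketch, not part of the repository: the helper name list_checkpoints is hypothetical, and it assumes weights are stored as .pth or .pt files, which is inferred from the file names mentioned in this README.

```python
from pathlib import Path


def list_checkpoints(root="genai/checkpoints", suffixes=(".pth", ".pt")):
    """Group checkpoint files under `root` by their parent directory.

    Hypothetical helper: assumes weights are stored as .pth/.pt files.
    Returns {} if the directory does not exist yet (nothing trained so far).
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return {}
    found = {}
    for path in sorted(root_path.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            # "." means the file sits directly in the checkpoint root
            key = path.parent.relative_to(root_path).as_posix()
            found.setdefault(key, []).append(path.name)
    return found


if __name__ == "__main__":
    for folder, files in list_checkpoints().items():
        print(f"{folder}: {', '.join(files)}")
```

An empty result simply means no training script has written a checkpoint yet.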
This repository contains the following structure:
gen-ai/
├── .gitignore  # Git ignore file
├── .python-version  # Python version specification
├── .venv/  # Virtual environment directory
├── README.md  # Project documentation
├── docker-compose.yml  # Docker Compose configuration for multi-service setup
├── pytest.ini  # Pytest configuration file
├── start_jupyter.bat  # Windows Jupyter startup script
├── start_jupyter.py  # Python Jupyter startup script
├── start_jupyter.sh  # Unix/Linux Jupyter startup script
├── jupyter/  # Jupyter Notebook service
│   ├── Dockerfile  # Docker container configuration for Jupyter
│   ├── README.md  # Jupyter service documentation
│   ├── requirements.txt  # Python dependencies for Jupyter notebooks
│   └── workspace/  # Jupyter workspace directory
└── genai/  # Main application directory
    ├── .dockerignore  # Docker ignore file
    ├── .env.example  # Environment variables template
    ├── API_DOCUMENTATION.md  # API documentation
    ├── Dockerfile  # Docker container configuration
    ├── README.md  # Application-specific documentation
    ├── bigram_model.py  # Bigram language model implementation
    ├── main.py  # FastAPI application entry point (router-based)
    ├── pyproject.toml  # Python project configuration and dependencies
    ├── start.sh  # Application startup script
    ├── commands/  # Command-line training scripts
    │   ├── train_cnn.py  # Script to train CNN models
    │   ├── train_gan.py  # Script to train GAN models
    │   ├── train_diffusion.py  # Script to train Diffusion model (CIFAR-10)
    │   ├── train_energy.py  # Script to train Energy-Based model (CIFAR-10)
    │   ├── train_llm_finetune.py  # Script to fine-tune LLM on custom datasets
    │   ├── train_llm_format.py  # Script to post-train LLM with RL for formatting
    │   ├── train_simple_lm.py  # Script to train simple GPT-style language model
    │   └── fine_tune_gpt2.py  # Fine-tune GPT-2 on SQuAD QA pairs
    ├── checkpoints/  # Model checkpoints and saved models
    │   ├── cnn/  # CNN model checkpoints
    │   ├── gan_{dataset}/  # GAN model checkpoints by dataset
    │   ├── diffusion_cifar/  # Diffusion model checkpoints (latest + final)
    │   ├── energy_cifar/  # Energy model checkpoints (latest + final)
    │   ├── llm_finetuned/  # Fine-tuned LLM checkpoints
    │   ├── llm_rl/  # RL-trained language model checkpoints
    │   └── rl_gpt_model.pth  # RL-fine-tuned GPT model
    ├── results/  # Generated sample images
    │   ├── diffusion_cifar/  # Diffusion outputs (grid + individual + originals)
    │   └── energy_cifar/  # Energy outputs (grid + individual + originals)
    ├── result/  # Result folder for output from functions
    ├── help_lib/  # Core helper modules (replaces legacy functions/)
    │   ├── __init__.py  # Package initialization
    │   ├── checkpoints.py  # Model checkpoint utilities
    │   ├── cifar10_classifier.py  # CIFAR-10 training/prediction orchestration
    │   ├── data_loader.py  # Dataset and dataloader utilities (CIFAR-10)
    │   ├── embeddings.py  # Text embedding helpers
    │   ├── evaluator.py  # Training/evaluation loops
    │   ├── generator.py  # Text generation helpers
    │   ├── model.py  # Model factory and optimizer setup
    │   ├── neural_networks.py  # Simple/Enhanced/Assignment CNNs and utils
    │   ├── probability.py  # Probability utilities
    │   ├── text_processing.py  # Text preprocessing helpers
    │   ├── trainer.py  # Generic train/eval history collection
    │   └── utils.py  # Shared utilities
    ├── models/  # API data models and CNN model factory
    │   ├── __init__.py  # Package initialization
    │   ├── requests.py  # Request data models
    │   ├── responses.py  # Response data models
    │   ├── cnn_models.py  # Practical CNN architectures and factory
    │   └── energy_diffusion_models.py  # Diffusion UNet + Energy model and trainers
    └── routers/  # FastAPI routers for organized API endpoints
        ├── __init__.py  # Package initialization
        ├── probability.py  # Probability and statistics API endpoints
        ├── embedding.py  # Text processing and embedding API endpoints
        ├── neural_networks.py  # Neural networks, CIFAR-10 & GAN API endpoints
        └── llm.py  # Language model training and generation API endpoints

- docker-compose.yml: Orchestrates multiple services including the FastAPI application and Jupyter notebook server
- pytest.ini: Configuration file for pytest testing framework
- .python-version: Specifies the Python version for the project
- .venv/: Virtual environment directory for local development
- start_jupyter.*: Cross-platform scripts for starting Jupyter notebook server
- tests/: Directory for unit tests (pytest compatible)
- jupyter/: Jupyter notebook service containing:
  - Dockerfile: Container configuration for Jupyter Lab/Notebook environment
  - requirements.txt: Python dependencies for data science and machine learning
  - workspace/: Mounted workspace directory
  - README.md: Jupyter service specific documentation

- genai/main.py: Main FastAPI application entry point with router-based architecture
- genai/bigram_model.py: Implementation of a bigram language model for text generation
- genai/pyproject.toml: Modern Python project configuration with dependencies and build settings
- genai/Dockerfile: Container configuration for building the FastAPI application image
- genai/start.sh: Shell script for starting the application
- genai/.env.example: Template for environment variables configuration
- genai/API_DOCUMENTATION.md: Comprehensive API documentation
- genai/test_routers.py: Test script for router-based API endpoints
- genai/commands/: Command-line training scripts:
  - train_cnn.py: Script to train CNN models on CIFAR-10
  - train_gan.py: Script to train GAN models for image generation
  - train_diffusion.py: Train a UNet-based diffusion model on CIFAR-10
  - train_energy.py: Train an energy-based model on CIFAR-10
  - train_simple_lm.py: Train a simple GPT-style word-level language model (Transformer)
- genai/checkpoints/: Model checkpoints and saved models:
  - cnn_{dataset}/: CNN model checkpoints by dataset
  - gan_{dataset}/: GAN model checkpoints by dataset
  - diffusion_cifar/: Diffusion model (latest_checkpoint.pth, diffusion_cifar.pth)
  - energy_cifar/: Energy model (latest_checkpoint.pth, energy_cifar.pth)
- genai/help_lib/: Core helper modules for NLP and vision:
  - data_loader.py: CIFAR-10 transforms and dataloaders
  - neural_networks.py: Includes SimpleCNN, EnhancedCNN, and custom CNN architectures
  - model.py: Model factory for CNN architectures
  - trainer.py, evaluator.py: Training loops and metrics history
  - embeddings.py, text_processing.py, probability.py, utils.py
- genai/models/: Data models and CNN factory for API:
  - requests.py: API request data structures
  - responses.py: API response data structures
  - cnn_models.py: Practical CNNs (Simple, Enhanced, Flexible) and factory
  - energy_diffusion_models.py: Diffusion UNet, Energy model, trainers, and dataloaders
  - simple_lm.py: GPT-style word-level language model (Transformer)
  - rl_gpt_model.py: RL environment and policy for GPT formatting
- genai/routers/: FastAPI routers for organized API endpoints:
  - probability.py: Probability and statistics API endpoints (/probability/*)
  - embedding.py: Text processing and embedding API endpoints (/embedding/*)
  - neural_networks.py: Neural networks, CIFAR-10 & GAN API endpoints (/neural-networks/*)
  - llm.py: Language model training and generation API endpoints (/llm/*)

# Mac (uses MPS automatically)
python genai/commands/train_cnn.py \
--model_type simple \
--dataset cifar10 \
--epochs 10 \
--batch_size 64 \
--lr 0.001
# Specify device manually (optional)
python genai/commands/train_cnn.py --epochs 10 --device mps # Mac
python genai/commands/train_cnn.py --epochs 10 --device cuda # Linux/Windows with NVIDIA GPU

Outputs:
- Checkpoints: genai/checkpoints/cnn_cifar10/
- Model file: genai/checkpoints/cnn_cifar10/cnn_cifar10.pth

python genai/commands/train_gan.py \
--dataset mnist \
--epochs 20 \
--batch_size 64 \
--device mps # Auto-detected on Mac

Outputs:
- Checkpoints: genai/checkpoints/gan_mnist/
- Generated samples saved during training

python genai/commands/train_diffusion.py \
--epochs 50 \
--batch_size 128 \
--lr 0.001 \
--diffusion_steps 200

Outputs:
- Checkpoints: genai/checkpoints/diffusion_cifar/
- Samples: genai/results/diffusion_cifar/ (grid: generated_samples.png, individual images in individual/, and originals in original/ and original_128/)

Note: Increase --diffusion_steps (e.g., 500–1000) for higher-quality samples.
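
To give a feel for what --diffusion_steps controls, here is a small sketch of a DDPM-style linear noise schedule in plain Python. It is an assumption for illustration only: this README does not say which schedule train_diffusion.py uses, and the beta range (1e-4 to 0.02) is a commonly used default rather than a value read from the repository. The function names are hypothetical.

```python
def linear_beta_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed DDPM-style schedule)."""
    if steps < 2:
        return [beta_start]
    step = (beta_end - beta_start) / (steps - 1)
    return [beta_start + i * step for i in range(steps)]


def cumulative_signal(betas):
    """alpha_bar_t = product of (1 - beta_s) for s <= t.

    This is the squared scale of the original image that is still present
    in the noised sample at step t.
    """
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


if __name__ == "__main__":
    for steps in (200, 500, 1000):
        final = cumulative_signal(linear_beta_schedule(steps))[-1]
        print(f"steps={steps:4d}  remaining signal at the last step: {final:.5f}")
```

Under this assumed schedule, more steps push the remaining signal closer to zero, so the forward process ends nearer to pure noise and each denoising step has a smaller change to undo. The cost is longer training and sampling.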
python genai/commands/train_energy.py \
--epochs 50 \
--batch_size 128 \
--lr 0.0001

Outputs:
- Checkpoints: genai/checkpoints/energy_cifar/
- Samples: genai/results/energy_cifar/ (grid + individual + originals)

Device Auto-Detection:
- Mac: Automatically uses mps (Metal Performance Shaders) for GPU acceleration
- Linux/Windows: Uses cuda if an NVIDIA GPU is available, otherwise cpu
- CPU training is very slow for CNN and LLM models; use a GPU when possible
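
The preference order described above can be sketched as follows. This is not the scripts' actual code: choose_device and detect_device are hypothetical names, and the only PyTorch calls used are torch.backends.mps.is_available() and torch.cuda.is_available(). If PyTorch is not installed, the sketch falls back to cpu.

```python
def choose_device(mps_available, cuda_available, requested=None):
    """Return the device name to train on.

    Follows the order described in this README: mps, then cuda, then cpu.
    `requested` plays the role of an explicit --device flag and wins if given.
    """
    if requested:
        return requested
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


def detect_device(requested=None):
    """Probe PyTorch for available backends; fall back to CPU if torch is missing."""
    try:
        import torch
    except ImportError:
        return choose_device(False, False, requested)
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps_backend is not None and mps_backend.is_available())
    return choose_device(mps_ok, torch.cuda.is_available(), requested)


if __name__ == "__main__":
    print("Training would run on:", detect_device())
```

Keeping the decision in a pure function (choose_device) makes the fallback order easy to test without any GPU present.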
Train a tiny GPT-style Transformer LM from scratch on plain text (word-level). Good for quick experiments and understanding training loops.
python genai/commands/train_simple_lm.py \
--data_file path/to/text.txt \
--epochs 5 \
--seq_len 30 \
--batch_size 64 \
--vocab_size 8000 \
--embedding_dim 256 \
--hidden_dim 512 \
--num_layers 1 \
--dropout 0.1 \
--learning_rate 3e-4 \
--output_dir genai/checkpoints/simple_lm

Outputs:
- Checkpoint: genai/checkpoints/simple_lm/simple_lm.pt
- Vocabulary: genai/checkpoints/simple_lm/vocab.json

Notes:
- If --data_file is omitted, a small built-in sample text is used
- Preprocessing: lowercase, remove punctuation (keep spaces), whitespace tokenization
- Device auto-detect: prefers Apple mps, then cuda, else cpu
- CPU training is very slow; use a GPU (MPS/CUDA) when available
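
The preprocessing and the --vocab_size cap described in the notes could look roughly like this. It is a sketch under assumptions, not the code in train_simple_lm.py: the function names, the special tokens <PAD> and <UNK>, and their ids are all hypothetical choices for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase, replace punctuation with spaces, then split on whitespace."""
    text = re.sub(r"[^a-zA-Z0-9\s]", " ", text)
    return text.lower().split()


def build_vocab(tokens, vocab_size=8000, specials=("<PAD>", "<UNK>")):
    """Map the most frequent tokens to integer ids, reserving the first ids for specials.

    The special-token names and their positions are assumptions for this sketch.
    """
    vocab = {tok: idx for idx, tok in enumerate(specials)}
    room = max(vocab_size - len(specials), 0)
    for tok, _count in Counter(tokens).most_common(room):
        vocab[tok] = len(vocab)
    return vocab


if __name__ == "__main__":
    sample = "The cat sat. The cat ran!"
    vocab = build_vocab(tokenize(sample), vocab_size=4)
    print(vocab)
```

With a cap of 4, only the two most frequent words fit next to the two special tokens. Every other word would be mapped to <UNK> at encoding time.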
Quick greedy generation example (Python):
import json
import re

import torch
from genai.models.simple_lm import SimpleLanguageModel

ckpt_dir = "genai/checkpoints/simple_lm"
with open(f"{ckpt_dir}/vocab.json", "r", encoding="utf-8") as f:
    vocab = json.load(f)
id_to_token = {i: t for t, i in vocab.items()}

# Use the SAME dims you trained with (defaults shown)
model = SimpleLanguageModel(
    vocab_size=len(vocab),
    embedding_dim=256,
    hidden_dim=512,
    num_layers=1,
    dropout=0.1,
)
model.load_state_dict(torch.load(f"{ckpt_dir}/simple_lm.pt", map_location="cpu"))
model.eval()

def preprocess(text: str):
    # Replace anything that is not a letter, digit, or whitespace with a space
    text = re.sub(r"[^a-zA-Z0-9\s]", " ", text)
    # Collapse runs of whitespace
    text = re.sub(r"\s+", " ", text)
    return text.lower().strip().split()

def encode(tokens):
    return [vocab.get(t, vocab.get("<UNK>", 1)) for t in tokens]

def decode(ids):
    return " ".join(id_to_token.get(i, "<UNK>") for i in ids)

def generate(prompt: str, max_new_tokens: int = 30):
    tokens = encode(preprocess(prompt))
    for _ in range(max_new_tokens):
        inp = torch.tensor([tokens], dtype=torch.long)
        with torch.no_grad():
            logits = model(inp)  # (1, T, V)
        # Greedy decoding: pick the highest-scoring next token
        next_id = int(logits[0, -1].argmax().item())
        tokens.append(next_id)
    return decode(tokens)

print(generate("in the beginning"))

Fine-tune GPT-2 on the SQuAD dataset for question-answering tasks. Recommended for Mac users: the script uses MPS automatically for faster training.
# Mac (uses MPS automatically - recommended)
python genai/commands/fine_tune_gpt2.py \
--model_name openai-community/gpt2 \
--dataset_name squad \
--train_samples 10000 \
--eval_samples 2000 \
--epochs 3 \
--batch_size 8 \
--output_dir genai/checkpoints/gpt2_squad

Outputs:
- Checkpoints: genai/checkpoints/gpt2_squad/
- Best model: genai/checkpoints/gpt2_squad/best/
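
Fine-tuning a causal model such as GPT-2 on QA pairs usually means flattening each record into one training string. The sketch below shows one way to do that. The "Question:/Answer:" template and both function names are assumptions for illustration, and the template actually used by fine_tune_gpt2.py may differ. The record layout (an answers field holding a text list) follows the SQuAD dataset as published on HuggingFace, and <|endoftext|> is GPT-2's end-of-text token.

```python
def format_qa_example(question, answer, context=None, eos_token="<|endoftext|>"):
    """Turn one QA record into a single training string for a causal LM.

    The "Question:/Answer:" template is an assumption for illustration; the
    template used by fine_tune_gpt2.py may differ.
    """
    parts = []
    if context:
        parts.append(f"Context: {context.strip()}")
    parts.append(f"Question: {question.strip()}")
    # Ending with the EOS token teaches the model where an answer stops.
    parts.append(f"Answer: {answer.strip()}{eos_token}")
    return "\n".join(parts)


def first_answer(record):
    """SQuAD records store answers as {"text": [...], "answer_start": [...]}."""
    texts = record["answers"]["text"]
    return texts[0] if texts else ""


if __name__ == "__main__":
    record = {
        "question": "What does MPS stand for?",
        "answers": {"text": ["Metal Performance Shaders"], "answer_start": [0]},
    }
    print(format_qa_example(record["question"], first_answer(record)))
```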
Performance Notes:
- ⚡ MPS (Mac): Fast training on Apple Silicon
- ⚡ CUDA (Linux/Windows): Fast training on NVIDIA GPUs
- ⚠️ CPU: Very slow and not recommended for LLM training
Fine-tune any HuggingFace model on custom datasets:
python genai/commands/train_llm_finetune.py \
--model_name gpt2 \
--epochs 3 \
--batch_size 4 \
--lr 5e-5 \
--output_dir genai/checkpoints/llm_finetuned

Train a model to enforce a specific answer format using reinforcement learning:
python genai/commands/train_llm_format.py \
--epochs 150 \
--lr 1e-5 \
--base_model_path genai/checkpoints/gpt2_squad \
--target_prefix "That is a great question" \
--target_suffix "let me know if you have any other questions"

Outputs:
- Checkpoint: genai/checkpoints/rl_gpt_formatting.pth
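
RL-based formatting needs a reward signal that says how well a generated answer matches the target format. A minimal sketch of such a reward is shown below. It is an assumption, not the reward implemented in train_llm_format.py: the function name format_reward and the 0.5/0.5 weighting are hypothetical. Only the prefix and suffix strings come from the command above.

```python
def format_reward(text, target_prefix, target_suffix):
    """Score a generated answer: +0.5 for the expected opening, +0.5 for the closing."""
    normalized = text.strip().lower()
    reward = 0.0
    if normalized.startswith(target_prefix.strip().lower()):
        reward += 0.5
    # Ignore trailing punctuation so "questions." still counts as a match.
    if normalized.rstrip(".!? ").endswith(target_suffix.strip().lower()):
        reward += 0.5
    return reward


if __name__ == "__main__":
    prefix = "That is a great question"
    suffix = "let me know if you have any other questions"
    answer = f"{prefix}. Paris is the capital of France, {suffix}."
    print(format_reward(answer, prefix, suffix))
```

Partial credit for matching only one end gives the policy a gradient to follow before it produces fully formatted answers.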
For API-based training and inference, use the Docker setup (see Docker section below). Note that:
- API endpoints work fine with CUDA on Linux/Windows
- CPU-based API training is very slow for LLM and CNN models
- Mac users should prefer command-line training for best performance
API endpoints available:
- POST /llm/gpt2-finetune: GPT-2 QA fine-tuning
- POST /llm/gpt2-generate: Generate from fine-tuned GPT-2
- POST /llm/rl-train: RL post-training
- POST /llm/rl-generate: Generate using RL-formatted policy
- See genai/API_DOCUMENTATION.md for the full API reference
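
As a rough illustration of calling one of these endpoints from Python, the sketch below builds a JSON POST request using only the standard library. The endpoint path and port come from this README, but the payload field (prompt) is a placeholder: the real request schema is defined in genai/API_DOCUMENTATION.md. The helper names build_post and send are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8888"  # FastAPI port from the Docker setup


def build_post(path, payload, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for one of the endpoints."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical payload: the field name is a placeholder, not the documented schema.
request = build_post("/llm/gpt2-generate", {"prompt": "What is a diffusion model?"})
print(request.get_method(), request.full_url)
# To call the API once the service is up: print(send(request))
```

Building the request separately from sending it makes it possible to inspect the URL and body before the server is running.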
For Mac Users:

- Python Environment Setup:

    # Create virtual environment
    python -m venv .venv
    # Activate virtual environment
    source .venv/bin/activate
    # Install dependencies
    cd genai
    pip install -e .

- Train Models (see training sections above):

    # All scripts automatically use MPS on Mac
    python genai/commands/train_cnn.py --epochs 10
    python genai/commands/fine_tune_gpt2.py --epochs 3

- Start Jupyter Notebook (optional):

    ./start_jupyter.sh
    # Or: python start_jupyter.py

For Linux/Windows Users:

- Python Environment Setup:

    # Create virtual environment
    python -m venv .venv
    # Activate virtual environment
    # On Linux:
    source .venv/bin/activate
    # On Windows:
    .venv\Scripts\activate
    # Install dependencies
    cd genai
    pip install -e .

- Train Models:

    # Uses CUDA automatically if available, otherwise CPU
    python genai/commands/train_cnn.py --epochs 10 --device cuda

Note: CPU training is very slow for LLM and CNN models. Use CUDA (NVIDIA GPU) when available.
This project uses Python as specified in .python-version. Make sure you have the correct Python version installed.
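
A quick way to confirm that the active interpreter matches the pinned version is sketched below. It assumes .python-version holds a plain numeric version such as 3.11 or 3.11.4. Other formats that version managers accept are not handled, and the function names are hypothetical.

```python
import sys
from pathlib import Path


def read_pinned_version(path=".python-version"):
    """Return the version string pinned in the file, e.g. "3.11" or "3.11.4"."""
    return Path(path).read_text(encoding="utf-8").strip()


def matches_pinned(pinned, running=None):
    """True if the running interpreter agrees with every component that is pinned."""
    running = tuple(sys.version_info[:3]) if running is None else tuple(running)
    wanted = tuple(int(part) for part in pinned.split(".") if part.isdigit())
    return bool(wanted) and running[: len(wanted)] == wanted


if __name__ == "__main__":
    version_file = Path(".python-version")
    if version_file.exists():
        pinned = read_pinned_version(version_file)
        status = "OK" if matches_pinned(pinned) else "MISMATCH"
        print(f"pinned={pinned} running={sys.version.split()[0]} -> {status}")
```

Comparing only the pinned components means a file containing 3.11 accepts any 3.11.x interpreter.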
The project uses Docker Compose to orchestrate multiple services:
- FastAPI Service (genai-fast-api): API server on port 8888
- Jupyter Service (genai-jupyter): Notebook server on port 8889
# Start all services
docker-compose up --build -d
# Access the services:
# FastAPI: http://localhost:8888
# FastAPI Docs: http://localhost:8888/docs
# Jupyter: http://localhost:8889

For Mac Users:
- ❌ MPS (Apple Silicon GPU) is NOT supported in Docker
- Docker containers cannot access Mac's native GPU acceleration
- Recommendation: Use native command-line training scripts for best performance on Mac

For Linux/Windows Users:
- ✅ CUDA works fine in Docker on systems with NVIDIA GPUs
- ⚠️ CPU training is very slow for LLM and CNN models
- Use the --gpus all flag when running Docker containers for GPU access: docker run --gpus all -p 8888:8888 genai-fastapi

Performance Comparison:
- MPS (Mac native): ⚡ Fast
- CUDA (Linux/Windows): ⚡ Fast
- CPU (Docker or no GPU): 🐌 Very slow and not recommended for training

You can build the FastAPI Docker image from different locations:
Option 1: Build from the project root directory (recommended):
cd /path/to/gen-ai
docker build -t genai-fastapi -f ./genai/Dockerfile ./genai

Option 2: Build from the genai directory:
cd /path/to/gen-ai/genai
docker build -t genai-fastapi .

Option 3: Build from root with different context:
cd /path/to/gen-ai
docker build -t genai-fastapi --build-arg BUILD_CONTEXT=root -f ./genai/Dockerfile .

To run the FastAPI container directly:
docker run -p 8888:8888 genai-fastapi

This will start the FastAPI app on port 8888. You can access the API at:
http://localhost:8888
And view the interactive API documentation at:
http://localhost:8888/docs
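
If the page does not load, it can help to check whether anything is listening on the expected port before digging into container logs. The sketch below does that with a plain TCP connection. The helper name is_port_open is hypothetical, and the port numbers are the ones given in this README.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Ports taken from this README: FastAPI on 8888, Jupyter on 8889.
    for name, port in (("FastAPI", 8888), ("Jupyter", 8889)):
        state = "up" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

A successful connection only shows that the port is bound. It does not prove the application behind it started correctly.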
# From project root
docker build -t genai-jupyter -f ./jupyter/Dockerfile ./jupyter
# Or from jupyter directory
cd jupyter
docker build -t genai-jupyter .

# Run the container (from the project root, so the workspace path resolves)
docker run -p 8889:8888 -v $(pwd)/jupyter/workspace:/home/jovyan genai-jupyter

This will start the Jupyter server on port 8889. You can access it at:
http://localhost:8889
To stop all running containers:
# Using Docker Compose
docker-compose down
# Using Make
make down

For troubleshooting, ensure your pyproject.toml is in the correct path and your Dockerfile matches the build context.
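
The check described above can be automated with a few lines of Python before running docker build. This is a sketch under assumptions: check_build_context is a hypothetical helper, and the list of files the Dockerfile expects inside the build context (here only pyproject.toml) is a guess based on this README, not read from the Dockerfile itself.

```python
from pathlib import Path


def check_build_context(context_dir, dockerfile, required=("pyproject.toml",)):
    """Return a list of problems that would likely break `docker build`.

    `context_dir` is the last argument to `docker build`; `dockerfile` is the -f path.
    Which files the Dockerfile actually copies is an assumption of this sketch.
    """
    context = Path(context_dir)
    if not context.is_dir():
        return [f"build context not found: {context}"]
    problems = []
    if not Path(dockerfile).is_file():
        problems.append(f"Dockerfile not found: {dockerfile}")
    for name in required:
        if not (context / name).is_file():
            problems.append(f"{name} missing from build context {context}")
    return problems


if __name__ == "__main__":
    # Option 1 from this README: context ./genai, Dockerfile ./genai/Dockerfile
    for problem in check_build_context("./genai", "./genai/Dockerfile"):
        print("WARNING:", problem)
```

An empty list means the paths line up. Any COPY instructions in the Dockerfile still need to use paths relative to the chosen context.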