An AI-powered code analysis tool that uses NVIDIA Nemotron models via Ollama to identify GPU optimization opportunities in your code. Built for the NVIDIA GTC 2026 Golden Ticket Developer Contest.
GPU Code Optimizer AI analyzes Python, CUDA, and C++ code to identify performance bottlenecks and suggest specific optimizations:
- Memory Coalescing - Detect uncoalesced memory accesses
- Kernel Fusion - Identify opportunities to combine kernels
- Shared Memory - Suggest shared memory optimizations
- Tensor Core Utilization - Recommend tensor core operations
- GPU Occupancy - Analyze thread block configurations
- Memory Bandwidth - Optimize data transfer patterns
- Compute Intensity - Improve arithmetic intensity
- Parallelism - Find parallelization opportunities
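To give a concrete flavor of the parallelism and compute-intensity categories, here is a minimal before/after sketch (hypothetical NumPy code, not output of the tool):

```python
import numpy as np

# Before: element-wise Python loop (serial, low arithmetic intensity)
def scale_and_shift_loop(x, a, b):
    out = np.empty_like(x)
    for i in range(len(x)):
        out[i] = a * x[i] + b
    return out

# After: one vectorized expression that maps cleanly onto GPU-style
# data parallelism (and is already far faster on CPU with NumPy)
def scale_and_shift_vectorized(x, a, b):
    return a * x + b

x = np.arange(5, dtype=np.float64)
print(np.allclose(scale_and_shift_loop(x, 2.0, 1.0),
                  scale_and_shift_vectorized(x, 2.0, 1.0)))  # True
```

The analyzer flags loop-heavy patterns like the first function and suggests vectorized or batched equivalents like the second.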
- Install Ollama: https://ollama.ai
- Install NVIDIA Nemotron model:
ollama pull nemotron
- Python 3.8+
Clone this repository:
git clone https://github.com/HUM4NITY/gpu-code-optimizer-ai.git
cd gpu-code-optimizer-ai
Install Python dependencies:
pip install -r requirements.txt
python app.py
Then open your browser to: http://localhost:8000
# Terminal 1 - Start Backend
python app.py
# Terminal 2 - Start Frontend
cd frontend
npm install
npm run dev
Then open your browser to: http://localhost:3000
The Next.js version includes:
- Polished UI with animations
- Built-in code examples
- Modern design with shadcn/ui
- Better performance
- Paste Your Code - Enter Python, CUDA, or C++ GPU code
- Select Model - Choose NVIDIA Nemotron or other available models
- Analyze - Click "Analyze & Optimize"
- Get Results - Receive detailed optimization suggestions with:
- Severity levels (Critical, High, Medium, Low)
- Specific issues and solutions
- Code examples
- Estimated speedup predictions
import torch

def inefficient_batch_process(data):
    results = []
    for item in data:
        item_gpu = item.cuda()        # one CPU-to-GPU transfer per item
        result = model(item_gpu)      # one small forward pass per item
        results.append(result.cpu())  # one GPU-to-CPU transfer per item
    return results
The AI will identify:
- CPU-GPU transfer overhead
- No batch processing
- Synchronous execution
- Suggested optimization: batch processing with a single GPU transfer
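A sketch of the batched rewrite the analyzer would point toward (illustrative only: `model` is a stand-in `nn.Linear`, and the device falls back to CPU so the sketch runs without a GPU):

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(4, 2).to(device)  # stand-in for the real model

def efficient_batch_process(data):
    # One stacked tensor, one host-to-device transfer, one batched forward pass
    batch = torch.stack(data).to(device)
    with torch.no_grad():
        results = model(batch)
    return results.cpu()

data = [torch.randn(4) for _ in range(8)]
out = efficient_batch_process(data)
print(out.shape)  # torch.Size([8, 2])
```

The key change is moving the transfer and the forward pass outside the loop, which is exactly the "batch processing with a single GPU transfer" suggestion above.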
gpu-code-optimizer/
├── app.py             # FastAPI backend
├── frontend/          # Next.js modern UI (recommended)
│   ├── app/           # Next.js pages
│   ├── components/    # React components
│   └── lib/           # API client & types
├── static/            # Simple HTML/CSS/JS UI
│   ├── index_v2.html  # Alternative web interface
│   ├── style_v2.css   # Styling
│   └── script_v2.js   # Frontend logic
├── requirements.txt   # Python dependencies
└── README.md          # Documentation
Tech Stack:
- Backend: FastAPI for high-performance async API
- AI Engine: Ollama with NVIDIA Nemotron models
- Frontend (Modern): Next.js 14, React, TypeScript, Tailwind CSS, shadcn/ui, Framer Motion
- Frontend (Simple): Vanilla JavaScript (no build step!)
- Analysis: Custom prompt engineering for GPU optimization
- Instant feedback on code quality
- Interactive web interface
- Multiple model support
- Memory access patterns
- Kernel optimization opportunities
- Thread configuration issues
- Data transfer inefficiencies
- Specific code examples
- Estimated performance gains
- Severity-based prioritization
- Category-based grouping
- Simple setup (< 5 minutes)
- No GPU required to run the analyzer
- Works with existing code
- Export and share results
The application includes built-in examples:
- Inefficient CUDA Kernel - Matrix multiplication with poor memory access
- Python GPU Computing - PyTorch code with optimization opportunities
- NumPy Vectorization - CPU code that could benefit from GPU acceleration
Click "Load Example" in the UI to try them!
Analyze code for GPU optimizations
Request:
{
"code": "your code here",
"language": "python",
"model": "nemotron"
}
Response:
{
"optimizations": [...],
"summary": "Analysis summary",
"overall_score": 85,
"model_used": "nemotron"
}
List available Ollama models
Get example code snippets
Health check endpoint
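As a quick sanity check of the request schema above, here is a minimal hypothetical client using only the standard library (the endpoint path `/analyze` is an assumption; check app.py for the real route):

```python
import json
import urllib.request

API_URL = "http://localhost:8000/analyze"  # hypothetical path; check app.py

# Payload mirroring the request schema shown above
payload = {
    "code": "import torch\nx = torch.randn(1024).cuda()",
    "language": "python",
    "model": "nemotron",
}

def analyze(payload):
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Requires the backend to be running:
# result = analyze(payload)
# print(result["overall_score"], len(result["optimizations"]))
```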
- Code Submission - User submits code through web interface
- Prompt Engineering - System creates a detailed analysis prompt
- AI Analysis - Nemotron model analyzes code for GPU patterns
- Result Parsing - Structured JSON response with optimizations
- Visualization - Results displayed with severity and categories
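The pipeline above can be sketched roughly as follows. This is a simplified illustration, not the actual app.py: Ollama's local HTTP API at `localhost:11434/api/generate` is real, but the prompt wording and the parsing helper are stand-ins.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's local HTTP API

def build_prompt(code: str, language: str) -> str:
    # Step 2: prompt engineering - ask the model for structured JSON output
    return (
        f"Analyze this {language} code for GPU optimization opportunities.\n"
        'Respond with JSON: {"optimizations": [...], "summary": "..."}\n\n'
        f"```{language}\n{code}\n```"
    )

def extract_json(text: str) -> dict:
    # Step 4: result parsing - pull the first JSON object out of the reply
    match = re.search(r"\{.*\}", text, re.DOTALL)
    return json.loads(match.group(0)) if match else {"optimizations": [], "summary": text}

def analyze_with_ollama(code: str, language: str = "python", model: str = "nemotron") -> dict:
    body = json.dumps({"model": model, "prompt": build_prompt(code, language), "stream": False})
    req = urllib.request.Request(OLLAMA_URL, data=body.encode("utf-8"),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:  # Step 3: the model does the analysis
        return extract_json(json.load(resp)["response"])

# The parser works on any reply that embeds a JSON object:
print(extract_json('Here you go: {"optimizations": [], "summary": "ok"}')["summary"])  # ok
```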
This project is designed for the NVIDIA GTC Golden Ticket Developer Contest:
- Ollama Challenge: Built with Ollama and open models
- Bryan Catanzaro: Showcases NVIDIA Nemotron models
- Sabrina Koumoin: Demonstrates NVIDIA technology
- Practical Value - Solves real GPU optimization challenges
- NVIDIA Focus - Directly aligned with GPU performance
- Nemotron Showcase - Highlights model capabilities
- Open Source - Community-driven development
- Production Ready - Clean architecture, good UX
- Multi-file project analysis
- Integration with VS Code extension
- Performance benchmarking tools
- CI/CD pipeline integration
- Support for more languages (Rust, Julia)
- Automatic code refactoring
- Historical analysis tracking
MIT License - Feel free to use, modify, and distribute!
Contributions welcome! This is an open-source project for the NVIDIA developer community.
Built by a developer passionate about GPU computing and AI.
For NVIDIA GTC Contest:
- Tag: #NVIDIAGTC
- Models: NVIDIA Nemotron via Ollama
- Category: Developer Tools, GPU Optimization, AI
- NVIDIA for Nemotron models and GPU technology
- Ollama for making model deployment simple
- FastAPI for excellent Python web framework
- Open Source Community for inspiration
Built with NVIDIA Nemotron | Powered by Ollama | For Developers, By Developers