An AI-powered code analysis tool that uses NVIDIA Nemotron models via Ollama to identify GPU optimization opportunities in your code. Built for the NVIDIA GTC 2026 Golden Ticket Developer Contest.
GPU Code Optimizer AI analyzes Python, CUDA, and C++ code to identify performance bottlenecks and suggest specific optimizations:
- Memory Coalescing - Detect uncoalesced memory accesses
- Kernel Fusion - Identify opportunities to combine kernels
- Shared Memory - Suggest shared memory optimizations
- Tensor Core Utilization - Recommend tensor core operations
- GPU Occupancy - Analyze thread block configurations
- Memory Bandwidth - Optimize data transfer patterns
- Compute Intensity - Improve arithmetic intensity
- Parallelism - Find parallelization opportunities
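To give a concrete flavor of the parallelism and compute-intensity categories, here is a minimal before/after sketch (hypothetical NumPy code, not output of the tool):

```python
import numpy as np

# Before: element-wise Python loop (serial, low arithmetic intensity)
def scale_and_shift_loop(x, a, b):
    out = np.empty_like(x)
    for i in range(len(x)):
        out[i] = a * x[i] + b
    return out

# After: one vectorized expression that maps cleanly onto GPU-style
# data parallelism (and is already far faster on CPU with NumPy)
def scale_and_shift_vectorized(x, a, b):
    return a * x + b

x = np.arange(5, dtype=np.float64)
print(np.allclose(scale_and_shift_loop(x, 2.0, 1.0),
                  scale_and_shift_vectorized(x, 2.0, 1.0)))  # True
```

The analyzer flags loop-heavy patterns like the first function and suggests vectorized or batched equivalents like the second.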
- Install Ollama: https://ollama.ai
- Install NVIDIA Nemotron model:
ollama pull nemotron
- Python 3.8+
Clone this repository:
git clone https://github.com/HUM4NITY/gpu-code-optimizer-ai.git
cd gpu-code-optimizer-ai
Install Python dependencies:
pip install -r requirements.txt
python app.py
Then open your browser to: http://localhost:8000
# Terminal 1 - Start Backend
python app.py
# Terminal 2 - Start Frontend
cd frontend
npm install
npm run dev
Then open your browser to: http://localhost:3000
The Next.js version includes:
- Polished UI with animations
- Built-in code examples
- Modern design with shadcn/ui
- Better performance
- Paste Your Code - Enter Python, CUDA, or C++ GPU code
- Select Model - Choose NVIDIA Nemotron or other available models
- Analyze - Click "Analyze & Optimize"
- Get Results - Receive detailed optimization suggestions with:
- Severity levels (Critical, High, Medium, Low)
- Specific issues and solutions
- Code examples
- Estimated speedup predictions
import torch

def inefficient_batch_process(data):
    results = []
    for item in data:
        item_gpu = item.cuda()        # one CPU-to-GPU transfer per item
        result = model(item_gpu)      # one small forward pass per item
        results.append(result.cpu())  # one GPU-to-CPU transfer per item
    return results
The AI will identify:
- CPU-GPU transfer overhead
- No batch processing
- Synchronous execution
- Suggested optimization: batch processing with a single GPU transfer
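A sketch of the batched rewrite the analyzer would point toward (illustrative only: `model` is a stand-in `nn.Linear`, and the device falls back to CPU so the sketch runs without a GPU):

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(4, 2).to(device)  # stand-in for the real model

def efficient_batch_process(data):
    # One stacked tensor, one host-to-device transfer, one batched forward pass
    batch = torch.stack(data).to(device)
    with torch.no_grad():
        results = model(batch)
    return results.cpu()

data = [torch.randn(4) for _ in range(8)]
out = efficient_batch_process(data)
print(out.shape)  # torch.Size([8, 2])
```

The key change is moving the transfer and the forward pass outside the loop, which is exactly the "batch processing with a single GPU transfer" suggestion above.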
gpu-code-optimizer/
├── app.py             # FastAPI backend
├── frontend/          # Next.js modern UI (recommended)
│   ├── app/           # Next.js pages
│   ├── components/    # React components
│   └── lib/           # API client & types
├── static/            # Simple HTML/CSS/JS UI
│   ├── index_v2.html  # Alternative web interface
│   ├── style_v2.css   # Styling
│   └── script_v2.js   # Frontend logic
├── requirements.txt   # Python dependencies
└── README.md          # Documentation
Tech Stack:
- Backend: FastAPI for high-performance async API
- AI Engine: Ollama with NVIDIA Nemotron models
- Frontend (Modern): Next.js 14, React, TypeScript, Tailwind CSS, shadcn/ui, Framer Motion
- Frontend (Simple): Vanilla JavaScript (no build step!)
- Analysis: Custom prompt engineering for GPU optimization
- Instant feedback on code quality
- Interactive web interface
- Multiple model support
- Memory access patterns
- Kernel optimization opportunities
- Thread configuration issues
- Data transfer inefficiencies
- Specific code examples
- Estimated performance gains
- Severity-based prioritization
- Category-based grouping
- Simple setup (< 5 minutes)
- No GPU required to run the analyzer
- Works with existing code
- Export and share results
The application includes built-in examples:
- Inefficient CUDA Kernel - Matrix multiplication with poor memory access
- Python GPU Computing - PyTorch code with optimization opportunities
- NumPy Vectorization - CPU code that could benefit from GPU acceleration
Click "Load Example" in the UI to try them!
Analyze code for GPU optimizations
Request:
{
"code": "your code here",
"language": "python",
"model": "nemotron"
}
Response:
{
"optimizations": [...],
"summary": "Analysis summary",
"overall_score": 85,
"model_used": "nemotron"
}
List available Ollama models
Get example code snippets
Health check endpoint
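As a quick sanity check of the request schema above, here is a minimal hypothetical client using only the standard library (the endpoint path `/analyze` is an assumption; check app.py for the real route):

```python
import json
import urllib.request

API_URL = "http://localhost:8000/analyze"  # hypothetical path; check app.py

# Payload mirroring the request schema shown above
payload = {
    "code": "import torch\nx = torch.randn(1024).cuda()",
    "language": "python",
    "model": "nemotron",
}

def analyze(payload):
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Requires the backend to be running:
# result = analyze(payload)
# print(result["overall_score"], len(result["optimizations"]))
```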
- Code Submission - User submits code through web interface
- Prompt Engineering - System creates a detailed analysis prompt
- AI Analysis - Nemotron model analyzes code for GPU patterns
- Result Parsing - Structured JSON response with optimizations
- Visualization - Results displayed with severity and categories
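The pipeline above can be sketched roughly as follows. This is a simplified illustration, not the actual app.py: Ollama's local HTTP API at `localhost:11434/api/generate` is real, but the prompt wording and the parsing helper are stand-ins.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's local HTTP API

def build_prompt(code: str, language: str) -> str:
    # Step 2: prompt engineering - ask the model for structured JSON output
    return (
        f"Analyze this {language} code for GPU optimization opportunities.\n"
        'Respond with JSON: {"optimizations": [...], "summary": "..."}\n\n'
        f"```{language}\n{code}\n```"
    )

def extract_json(text: str) -> dict:
    # Step 4: result parsing - pull the first JSON object out of the reply
    match = re.search(r"\{.*\}", text, re.DOTALL)
    return json.loads(match.group(0)) if match else {"optimizations": [], "summary": text}

def analyze_with_ollama(code: str, language: str = "python", model: str = "nemotron") -> dict:
    body = json.dumps({"model": model, "prompt": build_prompt(code, language), "stream": False})
    req = urllib.request.Request(OLLAMA_URL, data=body.encode("utf-8"),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:  # Step 3: the model does the analysis
        return extract_json(json.load(resp)["response"])

# The parser works on any reply that embeds a JSON object:
print(extract_json('Here you go: {"optimizations": [], "summary": "ok"}')["summary"])  # ok
```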
This project is designed for the NVIDIA GTC Golden Ticket Developer Contest:
- Ollama Challenge: Built with Ollama and open models
- Bryan Catanzaro: Showcases NVIDIA Nemotron models
- Sabrina Koumoin: Demonstrates NVIDIA technology
- Practical Value - Solves real GPU optimization challenges
- NVIDIA Focus - Directly aligned with GPU performance
- Nemotron Showcase - Highlights model capabilities
- Open Source - Community-driven development
- Production Ready - Clean architecture, good UX
- Multi-file project analysis
- Integration with VS Code extension
- Performance benchmarking tools
- CI/CD pipeline integration
- Support for more languages (Rust, Julia)
- Automatic code refactoring
- Historical analysis tracking
MIT License - Feel free to use, modify, and distribute!
Contributions welcome! This is an open-source project for the NVIDIA developer community.
Built by a developer passionate about GPU computing and AI.
For NVIDIA GTC Contest:
- Tag: #NVIDIAGTC
- Models: NVIDIA Nemotron via Ollama
- Category: Developer Tools, GPU Optimization, AI
- NVIDIA for Nemotron models and GPU technology
- Ollama for making model deployment simple
- FastAPI for excellent Python web framework
- Open Source Community for inspiration
Built with NVIDIA Nemotron | Powered by Ollama | For Developers, By Developers