lmgo is a Windows system tray application that provides an easy-to-use interface for running local LLM models with the llama.cpp server and ROCm GPU acceleration. It is specifically optimized for systems with an AMD RYZEN AI MAX+ 395 processor and Radeon 8060S graphics.
This application only works on:
- Operating System: Windows 11
- Processor: AMD RYZEN AI MAX+ 395
- Graphics: Radeon 8060S
- Architecture: x86_64
The embedded llama-server is compiled specifically for ROCm GFX1151 architecture and will not work on other hardware configurations.
- System Tray Interface: Runs in the Windows system tray for easy access
- Automatic Model Discovery: Scans directories for .gguf model files
- Single Model Support: Load and run one model at a time
- Web Interface: Built-in web interface for the loaded model
- Auto-start on Boot: Option to start automatically with Windows
- Notifications: Windows toast notifications for model status
- Model-specific Configuration: Custom arguments for different models
- Automatic Web Browser Launch: Option to automatically open web interface when models load
- Download the executable: `lmgo.exe` is a standalone executable
- Create a models directory: Create a `models` folder in the same directory as `lmgo.exe`
- Place your models: Copy your `.gguf` model files to the `models` directory (see the example layout below)
- Run `lmgo.exe`: Double-click the executable
- Configuration: On first run, a default `lmgo.json` configuration file will be created
- System Tray: The application will appear in your system tray (notification area)
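After these steps, the folder should look roughly like this (the model file name is just an example):

```
lmgo.exe
lmgo.json          (created on first run)
models/
    qwen2.5-7b-instruct-q4_k_m.gguf
```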
- Right-click the tray icon to access the menu
- Load Model: Select "Load Model" → choose a model from the list
- Access Web Interface: Once loaded, select "Web Interface" to open the model's web UI
- Unload Model: Select "Unload Current Model" to stop the currently loaded model
The application creates a lmgo.json configuration file with the following structure:
```json
{
"modelDir": "./models",
"autoOpenWebEnabled": true,
"notifications": true,
"basePort": 8080,
"autoLoadModels": [],
"defaultArgs": [
"--prio-batch", "3",
"--no-host",
"--ctx-size", "131072",
"--batch-size", "4096",
"--ubatch-size", "4096",
"--threads", "0",
"--threads-batch", "0",
"-ngl", "999",
"--flash-attn", "on",
"--cache-type-k", "f16",
"--cache-type-v", "f16",
"--kv-offload",
"--no-repack",
"--direct-io",
"--mlock",
"--split-mode", "layer",
"--main-gpu", "0"
],
"modelSpecificArgs": {}
}
```

- modelDir: Directory containing `.gguf` model files
- autoOpenWebEnabled: Automatically open browser when model loads
- notifications: Enable Windows toast notifications
- basePort: Port number for the model (default: 8080)
- autoLoadModels: Model name to load automatically on startup (only one model supported)
- defaultArgs: Default arguments passed to llama-server
- modelSpecificArgs: Custom arguments for specific models (see the example below)
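For example, to give one model a smaller context window than the default, `modelSpecificArgs` presumably maps a model name to an extra argument list. The exact key format below is an assumption based on the config structure, not something the project documents:

```json
{
  "modelSpecificArgs": {
    "qwen2.5-7b-instruct-q4_k_m.gguf": ["--ctx-size", "32768"]
  }
}
```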
- Load Model:
  - Lists all discovered `.gguf` files in the models directory
  - Shows sharded models as single entries
  - If a model is already loaded, it will be unloaded first
- Unload Current Model:
  - Stops the currently loaded model
  - Menu item is enabled only when a model is running
- Web Interface:
  - Opens the browser to the loaded model's web UI
  - Menu item is enabled only when a model is running
- Auto-start on Boot:
  - Toggles automatic startup with Windows
  - Adds/removes a registry entry for auto-start (see the sketch after this list)
- Exit:
  - Stops all running models
  - Cleans up temporary files
  - Exits the application
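The auto-start toggle above works through the Windows registry. A minimal sketch of the usual approach, using `golang.org/x/sys/windows/registry`; the value name `"lmgo"` is an illustrative assumption, not the project's confirmed identifier:

```go
package main

import (
	"os"

	"golang.org/x/sys/windows/registry"
)

const runKey = `Software\Microsoft\Windows\CurrentVersion\Run`

// setAutoStart adds or removes an HKCU Run entry for the current executable.
// The value name "lmgo" is assumed for illustration.
func setAutoStart(enable bool) error {
	key, err := registry.OpenKey(registry.CURRENT_USER, runKey, registry.SET_VALUE)
	if err != nil {
		return err
	}
	defer key.Close()

	if enable {
		exe, err := os.Executable()
		if err != nil {
			return err
		}
		return key.SetStringValue("lmgo", exe)
	}
	return key.DeleteValue("lmgo")
}
```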
- llama-server: Custom compiled version for ROCm GFX1151, embedded in the executable (see the extraction sketch below)
- Icon: Embedded favicon.ico for tray and notifications
- Configuration: Default settings optimized for AMD hardware
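Since llama-server ships inside the executable, it presumably has to be unpacked to a temporary location before it can be started (and removed again on exit, per the cleanup behavior above). A sketch of such an extraction with Go's standard library, assuming a gzipped tar archive; this is an illustration, not the project's actual code:

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"io"
	"os"
	"path/filepath"
)

// extractArchive unpacks a gzipped tar archive into destDir.
func extractArchive(data []byte, destDir string) error {
	gz, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return err
	}
	defer gz.Close()

	tr := tar.NewReader(gz)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return nil // archive fully extracted
		}
		if err != nil {
			return err
		}
		target := filepath.Join(destDir, filepath.Clean(hdr.Name))
		switch hdr.Typeflag {
		case tar.TypeDir:
			if err := os.MkdirAll(target, 0o755); err != nil {
				return err
			}
		case tar.TypeReg:
			if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
				return err
			}
			out, err := os.Create(target)
			if err != nil {
				return err
			}
			if _, err := io.Copy(out, tr); err != nil {
				out.Close()
				return err
			}
			out.Close()
		}
	}
}
```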
- Supports both single-file and sharded (.gguf) models (see the discovery sketch below)
- Single model at a time (loading new model unloads current one)
- Graceful cleanup on exit
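Sharded GGUF models follow llama.cpp's `name-00001-of-0000N.gguf` naming convention, which is how several files can be presented as a single menu entry. A sketch of how such discovery might work (illustrative, not the project's actual code):

```go
package main

import (
	"os"
	"regexp"
	"strings"
)

// shardPattern matches llama.cpp-style shard names such as
// "model-00001-of-00004.gguf", capturing the shared base name.
var shardPattern = regexp.MustCompile(`^(.+)-\d{5}-of-\d{5}\.gguf$`)

// discoverModels lists .gguf models in dir, collapsing shards into one entry.
func discoverModels(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	seen := map[string]bool{}
	var models []string
	for _, e := range entries {
		name := e.Name()
		if e.IsDir() || !strings.HasSuffix(name, ".gguf") {
			continue
		}
		if m := shardPattern.FindStringSubmatch(name); m != nil {
			name = m[1] // all shards of a model share this base name
		}
		if !seen[name] {
			seen[name] = true
			models = append(models, name)
		}
	}
	return models, nil
}
```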
- Windows registry for auto-start configuration
- System tray integration via a systray library (see the sketch below)
- Windows toast notifications
- Console window hidden by default
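A sketch of the tray skeleton, assuming the widely used `github.com/getlantern/systray` package; the text above only says "systray library", so the exact dependency and menu wiring are assumptions:

```go
package main

import "github.com/getlantern/systray"

// trayIcon would hold the embedded favicon.ico bytes in the real app.
var trayIcon []byte

func main() {
	systray.Run(onReady, onExit)
}

func onReady() {
	systray.SetIcon(trayIcon)
	systray.SetTooltip("lmgo")

	load := systray.AddMenuItem("Load Model", "Choose a .gguf model to run")
	quit := systray.AddMenuItem("Exit", "Stop the model and exit")

	go func() {
		for {
			select {
			case <-load.ClickedCh:
				// show the model list and start llama-server
			case <-quit.ClickedCh:
				systray.Quit()
			}
		}
	}()
}

func onExit() {
	// stop llama-server and remove temporary files here
}
```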
- "No .gguf files found"
- Ensure models are in the correct directory (default:
./models) - Check file extensions are
.gguf
- Model fails to load
  - Verify model compatibility with llama.cpp
  - Check available disk space and memory
- Web interface not accessible
  - Check firewall settings
  - Verify the port is not blocked (see the health check below)
- Application doesn't start
  - Verify the system requirements (Windows 11, AMD RYZEN AI MAX+ 395 / Radeon 8060S)
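If the web interface is unreachable, it helps to first confirm the server is listening at all. llama-server exposes a `/health` endpoint, so (assuming the default `basePort` of 8080) a quick local check is:

```
curl http://localhost:8080/health
```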
Download `llama_cpp_rocm_gfx1151.tar.gz` first, then build:

```
go mod tidy
go build -ldflags "-s -w -H windowsgui" -buildvcs=false .
```

Embedded files:

- `favicon.ico`: Embedded using `//go:embed`
- `default_config.json`: Embedded default configuration
- `llama_cpp_rocm_gfx1151.tar.gz`: Embedded llama-server binary
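All three files are compiled into the binary with `//go:embed`. A minimal sketch of what that looks like; the variable names are illustrative, not the project's actual identifiers:

```go
package main

import (
	_ "embed"
)

//go:embed favicon.ico
var trayIcon []byte

//go:embed default_config.json
var defaultConfig []byte

//go:embed llama_cpp_rocm_gfx1151.tar.gz
var llamaServerArchive []byte
```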