Private AI chat desktop application with local LLM support.
All inference happens on your machine: no cloud, no data sharing.
- What is this?
- Demo
- Key Features
- Installation & Setup
- How to Start Using
- System Requirements
- Supported Models
- Privacy and Security
- Acknowledgments
- License
Oxide Lab is a native desktop application for running large language models locally. Built with Rust and Tauri v2, it provides a fast, private chat interface without requiring internet connectivity or external API services.
Demo videos: `dem1.mp4`, `dem2.mp4`, `dem3.mp4`
- 100% local inference: your data never leaves your machine
- Multi-architecture support: Llama, Qwen2, Qwen2.5, Qwen3, Qwen3 MoE, Mistral, Mixtral, DeepSeek, Yi, SmolLM2
- GGUF and SafeTensors model formats
- Hardware acceleration: CPU, CUDA (NVIDIA), Metal (Apple Silicon), Intel MKL, Apple Accelerate
- Streaming text generation
- Multi-language UI: English, Russian, Brazilian Portuguese
- Modern interface built with Svelte 5 and Tailwind CSS
- Node.js (for frontend build)
- Rust toolchain (for backend)
- For CUDA: NVIDIA GPU with CUDA toolkit
- For Metal: macOS with Apple Silicon
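Before building, you can confirm the toolchain is in place with standard version checks (nothing here is specific to Oxide Lab):

```bash
# Verify the frontend and backend toolchains
node --version    # Node.js
npm --version
rustc --version   # Rust toolchain
cargo --version

# Only needed for the CUDA backend
nvcc --version
```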
```bash
# Install dependencies
npm install

# Run with CPU backend
npm run tauri:dev:cpu

# Run with CUDA backend (NVIDIA GPU)
npm run tauri:dev:cuda

# Platform-aware development
npm run app:dev
```
```bash
# Build with CPU backend
npm run tauri:build:cpu

# Build with CUDA backend
npm run tauri:build:cuda
```
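Assuming the standard Tauri project layout (the README itself doesn't state where artifacts land), the bundler writes installers and bundles to Tauri's default output path:

```bash
# Inspect the generated installers/bundles (default Tauri output path)
ls src-tauri/target/release/bundle/
```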
```bash
npm run lint      # ESLint
npm run lint:fix  # ESLint with auto-fix
npm run check     # Svelte type checking
npm run format    # Prettier formatting
npm run test      # Vitest tests
```
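The Rust-side checks below target the Tauri backend crate. Assuming the standard Tauri layout (an assumption; the README doesn't show the project tree), that crate lives in `src-tauri/`, so run them from that directory:

```bash
# Assumption: the Rust crate lives in src-tauri/ (standard Tauri layout)
cd src-tauri
```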
```bash
cargo clippy   # Linting
cargo test     # Unit tests
cargo audit    # Security audit
```
- Build or download the application
- Download a compatible GGUF or SafeTensors model (e.g., from Hugging Face; see the example after this list)
- Launch Oxide Lab
- Load your model through the interface
- Start chatting
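As an illustration of step 2, one way to fetch a quantized model is with the `huggingface-cli` tool; the repository and file names below are examples, not project recommendations:

```bash
# Requires the Hugging Face CLI: pip install -U "huggingface_hub[cli]"
# Example only: download a quantized Qwen2.5 GGUF build into ./models
# (swap in whichever supported model you prefer)
huggingface-cli download Qwen/Qwen2.5-7B-Instruct-GGUF \
  qwen2.5-7b-instruct-q4_k_m.gguf \
  --local-dir ./models
```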
- Windows, macOS, or Linux
- Minimum 8 GB RAM (16+ GB recommended for larger models; as a rough guide, a 7B model quantized to 4 bits takes about 4 GB for weights alone)
- For GPU acceleration:
- NVIDIA: CUDA-compatible GPU
- Apple: M1/M2/M3 chip (Metal)
- Intel: CPU with MKL support
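To confirm the hardware is visible before enabling an accelerated backend, you can use standard vendor tooling (not part of Oxide Lab):

```bash
# NVIDIA: check driver and GPU visibility
nvidia-smi

# macOS: confirm an Apple Silicon chip (prints e.g. "Apple M1")
sysctl -n machdep.cpu.brand_string
```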
Architectures with full support:
- Llama (1, 2, 3, 4), Mistral, Mixtral, DeepSeek, Yi, SmolLM2, CodeLlama
- Qwen2, Qwen2.5, Qwen2 MoE
- Qwen3, Qwen3 MoE
Formats:
- GGUF (quantized models)
- SafeTensors
- All processing happens locally on your device
- No telemetry or data collection
- No internet connection required for inference
- Content Security Policy (CSP) enforced
This project is built on top of excellent open-source work:
- Candle – ML framework for Rust (HuggingFace)
- Tauri – Desktop application framework
- Svelte – Frontend framework
- Tokenizers – Fast tokenization (HuggingFace)
See THIRD_PARTY_LICENSES.md for full dependency attribution.
Apache-2.0 – see LICENSE
Copyright (c) 2025 FerrisMind