WebGPU is already supported in Chrome, Safari, Firefox, iOS (v26), and Android (check)
Demo, similar to ChatGPT: https://andreinwald.github.io/browser-llm/
HackerNews discussion: https://news.ycombinator.com/item?id=44767775
- No need to use your OPENAI_API_KEY - it's a local model that runs on your device
- No network requests to any API
- No need to install any program
- No need to download files manually (the model is cached in the browser via CacheStorage)
- The site asks before downloading large files (the LLM weights) to the browser cache
- Hosted on GitHub Pages from this repo - you can inspect exactly the code you're running
- Default model: Llama-3.2-1B
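
The startup flow implied by the list (check for WebGPU, look in CacheStorage, ask before a large download) could be sketched roughly as follows. This is an illustrative sketch, not necessarily how this repo implements it: `decideModelLoad`, the `Env` shape, and the cache name and model URL passed to it are all hypothetical. The browser pieces are injected through `env` so the logic can also be exercised outside a browser.

```typescript
// Minimal structural types, so the sketch compiles without DOM or WebGPU typings.
interface CacheLike {
  match(url: string): Promise<unknown>;
}
interface CacheStorageLike {
  open(name: string): Promise<CacheLike>;
}

// Hypothetical bundle of the browser features the flow depends on.
interface Env {
  gpu?: unknown;                         // navigator.gpu in a browser
  caches: CacheStorageLike;              // globalThis.caches in a browser
  confirm: (message: string) => boolean; // window.confirm in a browser
}

type Decision = "unsupported" | "cached" | "download" | "declined";

// Decide what to do before loading the model.
// cacheName and modelUrl are placeholders chosen by the caller (assumptions).
async function decideModelLoad(
  env: Env,
  cacheName: string,
  modelUrl: string,
  sizeLabel: string,
): Promise<Decision> {
  // 1. Without WebGPU the model cannot run on the device's GPU.
  if (!env.gpu) {
    return "unsupported";
  }

  // 2. If the file is already in CacheStorage, no download is needed.
  const cache = await env.caches.open(cacheName);
  const hit = await cache.match(modelUrl);
  if (hit !== undefined) {
    return "cached";
  }

  // 3. Otherwise ask the user before fetching a large file.
  const accepted = env.confirm(
    `Download ${sizeLabel} of model files to the browser cache?`,
  );
  return accepted ? "download" : "declined";
}
```

In a browser, `env` would be built from `navigator.gpu`, `globalThis.caches`, and `window.confirm`. A result of `"download"` means the caller may fetch the file and store it in the same cache, so later visits resolve to `"cached"`.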