silentx3-coder/Awesome-local-LLM

Awesome-local-LLM

A curated list of awesome platforms, tools, practices and resources that help run LLMs locally

Inference platforms

  • LM Studio - discover, download and run local LLMs
  • lemonade - a local LLM server with GPU and NPU acceleration

Inference engines

  • ollama - get up and running with LLMs
  • llama.cpp - LLM inference in C/C++
  • ik_llama.cpp - llama.cpp fork with additional SOTA quants and improved performance
  • koboldcpp - run GGUF models easily with a KoboldAI UI
  • vllm - a high-throughput and memory-efficient inference and serving engine for LLMs
  • Nano-vLLM - a lightweight vLLM implementation built from scratch
  • vllm-gfx906 - vLLM for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60
  • FastFlowLM - run LLMs on AMD Ryzen™ AI NPUs
  • exo - run your own AI cluster at home with everyday devices
  • sglang - a fast serving framework for large language models and vision language models

User Interfaces

  • Open WebUI - a user-friendly AI interface (supports Ollama, OpenAI API, ...)
  • Page Assist - use your locally running AI models to assist you in your web browsing

Large Language Models

Explorers, Benchmarks, Leaderboards

Model providers

  • Qwen - powered by Alibaba Cloud
  • Mistral AI - a pioneering French artificial intelligence startup
  • Tencent - a profile of a Chinese multinational technology conglomerate and holding company
  • Unsloth AI - focusing on making AI more accessible to everyone (GGUFs etc.)
  • bartowski - providing GGUF versions of popular LLMs
  • Beijing Academy of Artificial Intelligence - a private non-profit organization engaged in AI research and development
  • Open Thoughts - a team of researchers and engineers curating the best open reasoning datasets

Specific models

  • Qwen3 - a collection of the latest generation Qwen LLMs
  • Qwen3-Coder - a collection of Qwen's most agentic code models to date
  • Gemma 3 - a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models
  • OpenAI-o3-open - OpenAI's very first open-weight LLM
  • Mistral-Small-3.2-24B-Instruct-2506 - a versatile model designed to handle a wide range of generative AI tasks, including instruction following, conversational assistance, image understanding, and function calling
  • Magistral-Small-2507 - a model built on Mistral Small 3.1 (2503) with added reasoning capabilities
  • Devstral-Small-2507 - an agentic LLM for software engineering tasks fine-tuned from Mistral-Small-3.1
  • Voxtral-Small-24B-2507 - an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance
  • Mellum-4b-base - an LLM optimized for code-related tasks
  • OlympicCoder-32B - a code model that achieves very strong performance on competitive coding benchmarks such as LiveCodeBench and the 2024 International Olympiad in Informatics
  • NextCoder - a family of code-editing LLMs developed using the Qwen2.5-Coder Instruct variants as base
  • GLM-4.5 - a collection of hybrid reasoning models designed for intelligent agents
  • Hunyuan - a collection of Tencent's open-source efficient LLMs designed for versatile deployment across diverse computational environments
  • Phi-4-mini-instruct - a lightweight open model built upon synthetic data and filtered publicly available websites
  • Granite-3.3-2B-Instruct - an LLM fine-tuned for improved reasoning and instruction-following capabilities
  • Qwen-Image - an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing
  • chatterbox - first production-grade open-source TTS model
  • Jan-nano - a compact 4-billion parameter language model specifically designed and trained for deep research tasks
  • Jan-nano-128k - an enhanced version of Jan-nano featuring a native 128k context window that enables deeper, more comprehensive research without the performance degradation typically associated with context extension methods
  • HunyuanWorld-1 - an open-source 3D world generation model
  • Arch-Router-1.5B - the fastest LLM router model that aligns to subjective usage preferences

Tools

Coding Agents

  • OpenHands - a platform for software development agents powered by AI
  • cline - autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way
  • aider - AI pair programming in your terminal
  • tabby - an open-source GitHub Copilot alternative, set up your own LLM-powered code completion server
  • continue - create, share, and use custom AI code assistants with our open-source IDE extensions and hub of models, rules, prompts, docs, and other building blocks
  • void - an open-source Cursor alternative, use AI agents on your codebase, checkpoint and visualize changes, and bring any model or host locally
  • Roo-Code - a whole dev team of AI agents in your code editor
  • goose - an open-source, extensible AI agent that goes beyond code suggestions
  • opencode - an AI coding agent built for the terminal
  • kilocode - open source AI coding assistant for planning, building, and fixing code

Agent Frameworks

  • AutoGPT - a powerful platform that allows you to create, deploy, and manage continuous AI agents that automate complex workflows
  • langchain - build context-aware reasoning applications
  • langflow - a powerful tool for building and deploying AI-powered agents and workflows
  • autogen - a programming framework for agentic AI
  • llama_index - the leading framework for building LLM-powered agents over your data
  • crewAI - a framework for orchestrating role-playing, autonomous AI agents
  • agno - a full-stack framework for building Multi-Agent Systems with memory, knowledge and reasoning
  • SuperAGI - an open-source framework to build, manage and run useful Autonomous AI Agents
  • camel - the first and the best multi-agent framework
  • openai-agents-python - a lightweight, powerful framework for multi-agent workflows
  • ClaraVerse - privacy-first, fully local AI workspace with Ollama LLM chat, tool calling, agent builder, Stable Diffusion, and embedded n8n-style automation
  • ragbits - building blocks for rapid development of GenAI applications

Retrieval-Augmented Generation

  • graphrag - a modular graph-based RAG system
  • LightRAG - simple and fast RAG
  • graphiti - build real-time knowledge graphs for AI Agents
  • vanna - an open-source Python RAG framework for SQL generation and related functionality

Computer Use

  • open-interpreter - a natural language interface for computers
  • OmniParser - a simple screen-parsing tool for pure vision-based GUI agents
  • self-operating-computer - a framework to enable multimodal models to operate a computer
  • cua - the Docker Container for Computer-Use AI Agents
  • Agent-S - an open agentic framework that uses computers like a human

Browser Automation

  • puppeteer - a JavaScript API for Chrome and Firefox
  • playwright - a framework for Web Testing and Automation
  • Playwright MCP server - an MCP server that provides browser automation capabilities using Playwright
  • browser-use - make websites accessible for AI agents
  • firecrawl - turn entire websites into LLM-ready markdown or structured data
  • stagehand - the AI Browser Automation Framework

Memory Management

  • mem0 - universal memory layer for AI Agents
  • letta - the stateful agents framework with memory, reasoning, and context management
  • cognee - memory for AI Agents in 5 lines of code

Testing, Evaluation, and Observability

  • langfuse - an open-source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more
  • openllmetry - open-source observability for your LLM application, based on OpenTelemetry
  • giskard - open-source evaluation and testing for AI and LLM systems
  • agenta - an open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place

Research

  • Perplexica - an open-source alternative to Perplexity AI, the AI-powered search engine
  • gpt-researcher - an LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations
  • local-deep-researcher - fully local web research and report writing assistant
  • SurfSense - an open-source alternative to NotebookLM / Perplexity / Glean
  • local-deep-research - an AI-powered research assistant for deep, iterative research
  • maestro - an AI-powered research application designed to streamline complex research tasks
  • open-notebook - an open-source implementation of Notebook LM with more flexibility and features

Training and Fine-tuning

  • Kiln - the easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets
  • augmentoolkit - train an open-source LLM on new facts

Miscellaneous

  • presenton - an open-source AI presentation generator and API
  • OmniGen2 - exploration to advanced multimodal generation
  • 4o-ghibli-at-home - a powerful, self-hosted AI photo stylizer built for performance and privacy
  • Observer - local open-source micro-agents that observe, log and react, all while keeping your data private and secure

Hardware

Tutorials

Models

Prompt Engineering

Context Engineering

  • Context-Engineering - a frontier, first-principles handbook inspired by Karpathy and 3Blue1Brown for moving beyond prompt engineering to the wider discipline of context design, orchestration, and optimization
  • Awesome-Context-Engineering - a comprehensive survey on Context Engineering: from prompt engineering to production-grade AI systems

Agents

Retrieval-Augmented Generation

  • RAG Techniques - various advanced techniques for Retrieval-Augmented Generation (RAG) systems
  • Controllable RAG Agent - an advanced Retrieval-Augmented Generation (RAG) solution for complex question answering that uses a sophisticated graph-based algorithm to handle the tasks
  • LangChain RAG Cookbook - a collection of modular RAG techniques, implemented in LangChain + Python

Miscellaneous

Communities

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines on how to get started.
