This document is the contract for anyone (human or AI agent) touching LocalAI's admin REST surface, the in-process MCP server that wraps it, or the embedded skill prompts that teach the assistant how to use it. Read this before adding/removing/renaming admin endpoints, MCP tools, or skill recipes.
pkg/mcp/localaitools/ is a public Go package that exposes LocalAI's admin/management surface as an MCP server. It is used in two ways:
- In-process: when an admin opens a chat with metadata.localai_assistant=true, the chat handler injects the in-memory MCP server (paired net.Pipe() transport, no HTTP loopback) so the LLM can install models, manage backends and edit configs by chatting.
- Standalone: the local-ai mcp-server --target=… subcommand serves the same MCP server over stdio, talking HTTP to a remote LocalAI instance.
The two modes share all tool definitions and skill prompts. They differ only in their LocalAIClient implementation (inproc/ calls services directly; httpapi/ calls REST).
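The split between shared tool code and mode-specific clients can be sketched as follows. This is a minimal illustration, not the package's actual code: the single-method Client interface, the ListModels method, the modelService type, and the /models path are all assumptions made up for the example. The real LocalAIClient has many more methods.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Client is a stand-in for LocalAIClient with one hypothetical method.
type Client interface {
	ListModels(ctx context.Context) ([]string, error)
}

// modelService is a hypothetical in-process service.
type modelService struct{ names []string }

// inprocClient calls the service directly, with no HTTP round trip.
type inprocClient struct{ svc *modelService }

func (c inprocClient) ListModels(ctx context.Context) ([]string, error) {
	return c.svc.names, nil
}

// httpClient reaches the same data through a REST endpoint.
type httpClient struct {
	baseURL string
	http    *http.Client
}

func (c httpClient) ListModels(ctx context.Context) ([]string, error) {
	// "/models" is an illustrative path, not a documented LocalAI route.
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, c.baseURL+"/models", nil)
	if err != nil {
		return nil, err
	}
	resp, err := c.http.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	var names []string
	if err := json.NewDecoder(resp.Body).Decode(&names); err != nil {
		return nil, err
	}
	return names, nil
}

// describe plays the role of a shared tool handler: it only sees the interface.
func describe(ctx context.Context, c Client) (string, error) {
	names, err := c.ListModels(ctx)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%d models: %v", len(names), names), nil
}

// runBoth returns the handler output for each implementation.
func runBoth() (string, string, error) {
	svc := &modelService{names: []string{"a", "b"}}
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_ = json.NewEncoder(w).Encode(svc.names)
	}))
	defer srv.Close()

	ctx := context.Background()
	direct, err := describe(ctx, inprocClient{svc: svc})
	if err != nil {
		return "", "", err
	}
	remote, err := describe(ctx, httpClient{baseURL: srv.URL, http: srv.Client()})
	if err != nil {
		return "", "", err
	}
	return direct, remote, nil
}

func main() {
	direct, remote, err := runBoth()
	if err != nil {
		panic(err)
	}
	fmt.Println(direct)
	fmt.Println(remote)
}
```

Because describe depends only on the interface, both modes produce the same output for the same underlying state, which is the property a parity test would check.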
When you change LocalAI's admin surface, three layers must stay aligned:
1. REST endpoint in core/http/endpoints/localai/*.go.
2. MCP tool registration in pkg/mcp/localaitools/tools_*.go, plus a method on LocalAIClient (in client.go) and implementations in both inproc/client.go and httpapi/client.go.
3. Skill prompt under pkg/mcp/localaitools/prompts/skills/*.md: the markdown that teaches the LLM how to use the new tool. If the new tool fits an existing recipe, update that recipe; otherwise add a new file.
If you ship a REST endpoint without (2) and (3), conversational admins won't see the feature.
- REST endpoint exists in core/http/endpoints/localai/*.go and is gated by auth.RequireAdmin() in core/http/routes/localai.go.
- LocalAIClient interface in pkg/mcp/localaitools/client.go has a method covering the new operation.
- DTOs added/updated in pkg/mcp/localaitools/dto.go (JSON-tagged; never expose raw service types).
- inproc/client.go implements the new method by calling the service directly (not via HTTP loopback).
- httpapi/client.go implements the new method by calling the REST endpoint.
- Tool registration added in the appropriate pkg/mcp/localaitools/tools_*.go. Mutating tools must reference safety rule 1 in the description.
- If the tool is mutating, ensure Options{DisableMutating: true} skips it (mirror the pattern in tools_models.go).
- Skill prompt added or updated under pkg/mcp/localaitools/prompts/skills/. The prompt must instruct the LLM when to call the tool, what to ask the user first, and what to do on error.
- Tests:
  - pkg/mcp/localaitools/server_test.go adds the tool name to expectedFullCatalog and expectedReadOnlyCatalog (if read-only).
  - Tool dispatch is added to TestEachToolDispatchesToClient.
  - pkg/mcp/localaitools/httpapi/client_test.go covers the new HTTP path.
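The DisableMutating item in the checklist above can be sketched roughly like this. The registry, tool, and add names are hypothetical stand-ins and do not reflect the real MCP SDK or the pattern in tools_models.go; the Go constant names are also assumed. Only Options{DisableMutating: true} and the tool name strings come from this document.

```go
package main

import (
	"fmt"
	"sort"
)

// Tool name constants (constant names are assumed for this sketch).
const (
	ToolListInstalledModels = "list_installed_models"
	ToolInstallModel        = "install_model"
)

// Options mirrors the DisableMutating switch described in the checklist.
type Options struct{ DisableMutating bool }

// tool is a hypothetical registration record.
type tool struct {
	name        string
	description string
	mutating    bool
}

// registry is a hypothetical stand-in for the MCP server's tool table.
type registry struct{ tools map[string]tool }

// add registers t unless it mutates state and mutation is disabled.
func (r *registry) add(t tool, opts Options) {
	if t.mutating && opts.DisableMutating {
		return
	}
	r.tools[t.name] = t
}

// newRegistry registers one read-only and one mutating tool.
func newRegistry(opts Options) *registry {
	r := &registry{tools: map[string]tool{}}
	r.add(tool{
		name:        ToolListInstalledModels,
		description: "List installed models.",
	}, opts)
	r.add(tool{
		name:        ToolInstallModel,
		description: "Install a model. Mutating: confirm first (safety rule 1).",
		mutating:    true,
	}, opts)
	return r
}

// names returns the registered tool names in sorted order.
func (r *registry) names() []string {
	out := make([]string, 0, len(r.tools))
	for n := range r.tools {
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(newRegistry(Options{}).names())                      // full catalog
	fmt.Println(newRegistry(Options{DisableMutating: true}).names()) // read-only catalog
}
```

The two printed lists correspond to what expectedFullCatalog and expectedReadOnlyCatalog assert in the tests: a mutating tool appears only in the first.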
Sometimes you want to teach the LLM a new pattern that uses existing tools. Drop a markdown file under pkg/mcp/localaitools/prompts/skills/<verb>_<noun>.md. The file is automatically embedded by //go:embed and assembled into the system prompt in lexicographic order. No Go changes needed.
Conventions:
- Filename: <verb>_<noun>.md (e.g. install_chat_model.md, upgrade_backend.md).
- First line: # Skill: <Title Case description>.
- Number the steps. Reference exact tool names in backticks.
- If the skill mutates state, remind the LLM to confirm with the user.
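To make the "lexicographic order" behaviour concrete, here is a minimal sketch of assembling skill files from an fs.FS. It uses testing/fstest.MapFS as an in-memory stand-in so it runs without files on disk; in a real package an embed.FS populated by //go:embed could be passed instead, since it also implements fs.FS. The assemble function and the skill titles are hypothetical and are not taken from prompts.go.

```go
package main

import (
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// assemble concatenates every .md file directly under dir.
// fs.ReadDir returns entries sorted by filename, which gives
// the lexicographic ordering for free.
func assemble(fsys fs.FS, dir string) (string, error) {
	entries, err := fs.ReadDir(fsys, dir)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	for _, e := range entries {
		if e.IsDir() || !strings.HasSuffix(e.Name(), ".md") {
			continue
		}
		data, err := fs.ReadFile(fsys, dir+"/"+e.Name())
		if err != nil {
			return "", err
		}
		b.Write(data)
		b.WriteString("\n")
	}
	return b.String(), nil
}

// demoFS returns two hypothetical skill files, deliberately listed out of order.
func demoFS() fs.FS {
	return fstest.MapFS{
		"skills/upgrade_backend.md":    {Data: []byte("# Skill: Upgrade a Backend")},
		"skills/install_chat_model.md": {Data: []byte("# Skill: Install a Chat Model")},
	}
}

func main() {
	prompt, err := assemble(demoFS(), "skills")
	if err != nil {
		panic(err)
	}
	fmt.Print(prompt)
}
```

Since ordering follows the filename, the name chosen for a new skill determines where it lands in the assembled prompt.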
These rules guard against the magic-literal drift that surfaced in the first audit. Do not re-introduce bare strings.
- Tool names always come from the Tool* constants in pkg/mcp/localaitools/tools.go. Tool registrations, the test catalog (server_test.go's expectedFullCatalog / expectedReadOnlyCatalog), and dispatch tables reference the constants. The embedded skill prompts under prompts/ keep bare strings; that's the one allowed exception, and TestPromptsContainSafetyAnchors enforces alignment.
- Toggle/pin actions use the modeladmin.Action type (pkg/mcp/localaitools and core/services/modeladmin). Use ActionEnable / ActionDisable / ActionPin / ActionUnpin; never bare "enable" / "pin" strings.
- Capability tags for list_installed_models use the localaitools.Capability type (capability.go). The LocalAIClient.ListInstalledModels interface takes a typed Capability, and the inproc switch only accepts canonical values ("embed" / "embedding" are not aliases, only CapabilityEmbeddings).
- HTTP error checks in httpapi.Client use errors.Is(err, ErrHTTPNotFound), not substring matches on err.Error(). The typed *HTTPError carries StatusCode and Body; add new sentinel errors as needed rather than re-introducing string matching.
- Channel sends to GalleryService.ModelGalleryChannel / BackendGalleryChannel from inproc clients MUST select on ctx.Done() so a cancelled chat completion releases the goroutine. See inproc.sendModelOp / sendBackendOp.
- Disk writes of model config YAML go through modeladmin.writeFileAtomic (temp file + os.Rename). os.WriteFile truncates on crash and corrupts the model.
- MCP server lifecycle: every initialised holder MUST register Close() with signals.RegisterGracefulTerminationHandler. The standalone mcp-server CLI uses signal.NotifyContext to honour SIGINT/SIGTERM.
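Two of the rules above, sentinel-based HTTP error checks and context-aware channel sends, can be sketched as follows. This is an illustration under stated assumptions, not the real httpapi or inproc code: how *HTTPError actually maps to ErrHTTPNotFound is not specified here (an Is method is one possible way), and sendOp is a hypothetical simplification of helpers like sendModelOp.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
)

// ErrHTTPNotFound is a sentinel intended to be matched with errors.Is.
var ErrHTTPNotFound = errors.New("http: not found")

// HTTPError carries the status code and body of a failed response.
type HTTPError struct {
	StatusCode int
	Body       string
}

func (e *HTTPError) Error() string {
	return fmt.Sprintf("http %d: %s", e.StatusCode, e.Body)
}

// Is makes errors.Is(err, ErrHTTPNotFound) match any 404 HTTPError.
// Assumption: the real type may wire this up differently.
func (e *HTTPError) Is(target error) bool {
	return target == ErrHTTPNotFound && e.StatusCode == http.StatusNotFound
}

// sendOp delivers op on ch, or gives up as soon as ctx is cancelled,
// so a caller blocked on a full or unread channel is released.
func sendOp(ctx context.Context, ch chan<- string, op string) error {
	select {
	case ch <- op:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// notFoundMatches checks a wrapped 404 against the sentinel without string matching.
func notFoundMatches() bool {
	err := fmt.Errorf("get model: %w", &HTTPError{StatusCode: 404, Body: "no such model"})
	return errors.Is(err, ErrHTTPNotFound)
}

// cancelledSendReturns shows that a send with no receiver does not hang
// once the context is cancelled.
func cancelledSendReturns() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	ch := make(chan string) // unbuffered, no receiver: a bare send would block forever
	return errors.Is(sendOp(ctx, ch, "install"), context.Canceled)
}

func main() {
	fmt.Println(notFoundMatches())
	fmt.Println(cancelledSendReturns())
}
```

Both checks survive error wrapping and message rewording, which is the failure mode substring matching on err.Error() is prone to.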
pkg/mcp/localaitools/
  client.go              # LocalAIClient interface + DTO registry
  dto.go                 # JSON-tagged DTOs shared by both client impls
  server.go              # NewServer(client, opts): registers tools
  tools.go               # Tool* name constants (single source of truth)
  capability.go          # Capability type + constants
  tools_models.go        # gallery_search, install_model, import_model_uri, ...
  tools_backends.go
  tools_config.go
  tools_system.go
  tools_state.go
  prompts.go             # //go:embed loader + SystemPrompt(opts)
  prompts/00_role.md
  prompts/10_safety.md   # SAFETY RULES: change with care
  prompts/20_tools.md    # curated tool catalog with one-liners
  prompts/skills/*.md
  inproc/client.go       # in-process LocalAIClient (services-direct)
  httpapi/client.go      # REST LocalAIClient (for standalone CLI / remote)
core/http/endpoints/mcp/
  localai_assistant.go   # process-wide holder + LocalToolExecutor
core/cli/mcp_server.go   # local-ai mcp-server subcommand
The in-process MCP server runs inside the same LocalAI binary that serves chat. Going over HTTP loopback would (a) require minting a synthetic admin API key for the server to authenticate against itself, (b) double-marshal every tool dispatch, and (c) lose access to in-process channels (e.g. GalleryService.ModelGalleryChannel for streaming install progress). So in-process uses inproc.Client. The standalone stdio CLI talks to a remote LocalAI; HTTP is the only option, so it uses httpapi.Client. Both implement the same LocalAIClient interface, and the parity test in pkg/mcp/localaitools/parity_test.go (when present) keeps their output equivalent.
The user chose KISS. Every mutating tool has a safety rule (prompts/10_safety.md rule 1) that requires the LLM to summarise the action and wait for explicit user confirmation before calling it. There is no plan_*/apply_* two-step in code. If you add a mutating tool, do not add per-tool confirmation logic in Go — instead, list the new tool name in prompts/10_safety.md so the LLM knows it falls under the confirmation rule.
The in-memory MCP server runs only on the head node (where the chat handler runs). inproc.Client wraps services that are already distributed-aware (GalleryService coordinates with workers; ListNodes reads the NATS-populated registry). No NATS routing of MCP tools — the admin surface lives on the head, period.