This is a fork of github/mostlygeek/llama-swap for the project github/LM4eu/goinfer.
Back in 2023, Goinfer was an early local LLM proxy swapping models and supporting Ollama, llama.cpp, and KoboldCpp. To simplify maintenance, we decided in August 2025 to replace our process management with another well-maintained project.
As we do not use Ollama / KoboldCpp anymore, we integrated llama-swap into Goinfer to handle communication with llama-server.
- The command `go get github.com/mostlygeek/llama-swap@v123` fails because the version numbering `v123` does not conform to the Go standard `v1.2.3`. The workaround is to use `go get github.com/mostlygeek/llama-swap@main`, which sets `v0.0.0-20250925224418-bab7d1f3968a` in `go.mod` (see the sketch after this list).
- Importing llama-swap with this workaround is still not enough, because compilation requires the folder `proxy/ui_dist`, which does not exist in the source code:

  ```go
  //go:embed ui_dist
  var reactStaticFS embed.FS
  ```

  The second workaround is to clone llama-swap and use a `go.work` file (see the sketch after this list).
- At LM4eu, we want to use the web UI of the underlying inference engine (e.g. llama.cpp). But the current llama-swap always requires the model name within the client request (JSON), which makes it impossible to simply open a web page.
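For reference, a minimal sketch of the first workaround from the command line; the pseudo-version shown is the one we obtained when pinning `main`, and a later commit will produce a different string:

```sh
# Fails: the upstream tag "v123" is not a semver version accepted by Go modules.
go get github.com/mostlygeek/llama-swap@v123

# Workaround: pin the main branch instead; Go records a pseudo-version in go.mod,
# i.e. a line such as:
#   require github.com/mostlygeek/llama-swap v0.0.0-20250925224418-bab7d1f3968a
go get github.com/mostlygeek/llama-swap@main
```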
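And a minimal `go.work` sketch for the second workaround, assuming Goinfer and the llama-swap clone live side by side (the directory names and the Go version are placeholders):

```
go 1.25

use (
	./goinfer
	./llama-swap
)
```

With the `use` directive, the Go toolchain resolves the `github.com/mostlygeek/llama-swap` import from the local clone, where the missing `proxy/ui_dist` folder can be created before building.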
This fork therefore applies the following changes:

- Use version `v0.0.123`, compatible with Go expectations.
- Add a minimalist `proxy/ui_dist/index.html` (see the sketch after this list).
- When no model is specified in the request, llama-swap now defaults to the running model (first found), as illustrated after this list.
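The placeholder page only needs to exist so that the `//go:embed ui_dist` directive shown above compiles; as a sketch, "minimalist" can be as small as the following (the actual file in this fork may differ):

```html
<!doctype html>
<html>
  <body>
    <!-- Placeholder so that proxy/ui_dist exists and can be embedded. -->
    <p>The llama-swap UI is not bundled here; use the web UI of the inference engine (e.g. llama.cpp).</p>
  </body>
</html>
```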
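To illustrate the model-defaulting behavior, here is a rough Go sketch of the idea; the function and names are hypothetical, not the actual code of this fork:

```go
package main

import (
	"errors"
	"fmt"
)

// resolveModel picks the model for an incoming request:
// keep the model named in the JSON body when present,
// otherwise fall back to the first model already running.
// (Hypothetical helper, for illustration only.)
func resolveModel(requested string, running []string) (string, error) {
	if requested != "" {
		return requested, nil
	}
	if len(running) > 0 {
		return running[0], nil
	}
	return "", errors.New("no model requested and no model running")
}

func main() {
	// A request without a "model" field falls back to the running model.
	model, _ := resolveModel("", []string{"llama-3.1-8b"})
	fmt.Println(model) // llama-3.1-8b
}
```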
We will adapt to the upstream project's evolution while minimizing our patches. However, we may need to add more patches that could pollute the upstream project. So we prefer to wait and see whether github/LM4eu/goinfer is successful; if it is, we will discuss how to integrate our changes into the upstream project.
Special thanks to Benson Wong for maintaining llama-swap with clean and well-documented code.
Compared to some alternatives, we enjoy the readable source code of llama-swap, and we appreciate that its author regularly improves it.