🧪 Help wanted: testing on macOS (Apple Silicon/Intel) & ARM64 Linux (Raspberry Pi) #140

@primoco

These builds compile in CI but are untested: the maintainer doesn't own a Mac or an ARM64 board. If you do and you run local LLMs, your help is hugely appreciated.

Affected binaries (from the latest release)

Status  Binary
🧪      eullm-macos-arm64 (Metal, M1/M2/M3/M4)
🧪      eullm-macos-arm64-turboquant-exp (Metal + TurboQuant)
🧪      eullm-macos-x64 (Intel Mac)
🧪      eullm-linux-arm64 (Raspberry Pi 4/5, Orange Pi 5+, Rock 5B, Jetson, …)

How to help (5–10 minutes)

  1. Download your matching binary from the latest release.
  2. Check the version:
    chmod +x eullm-<your-binary>
    ./eullm-<your-binary> -V
    Expected: something like eullm 0.5.2 (Metal) or eullm 0.5.2 (CPU).
  3. Grab any small GGUF model to keep the test fast (Qwen3-0.5B-Q4_K_M, Phi-3-Mini, Gemma-2B, …) and run:
    ./eullm-<your-binary> run /path/to/model.gguf --ctx-size 4096
    The chat UI auto-starts on http://localhost:11435/.
  4. Report below with:
    • OS + version (e.g. macOS 15.2, Raspberry Pi OS Bookworm 64-bit)
    • Chip / SoC (e.g. M3 Pro, Apple M1, Pi 5 8GB, Pi 4 4GB, RK3588)
    • Output of eullm -V (the exact version line)
    • Tokens/sec you observe (the engine prints metrics after each turn)
    • What worked (model load? generation? chat UI? /api/chat? /v1/chat/completions?)
    • What broke (full error text + the last ~20 lines of engine stdout/stderr)
    • (optional) Did --cache-type-k q4_0 --cache-type-v q4_0 work too?
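Most of the system facts in the list above can be collected with one small script. This is only a sketch: `EULLM_BIN` and the function names are made up for illustration, and it relies on standard platform tools (`sw_vers` and `sysctl` on macOS, `/etc/os-release` and `/proc/device-tree/model` on Linux) rather than anything eullm provides. Tokens/sec still has to be copied by hand from the engine output.

```shell
#!/bin/sh
# Illustrative helper, not part of eullm.
# EULLM_BIN is an assumed variable: point it at the binary you downloaded.
EULLM_BIN="${EULLM_BIN:-./eullm}"

# OS name + version
os_info() {
    if command -v sw_vers >/dev/null 2>&1; then
        # macOS
        echo "$(sw_vers -productName) $(sw_vers -productVersion)"
    elif [ -r /etc/os-release ]; then
        # Most Linux distributions
        ( . /etc/os-release && echo "$PRETTY_NAME" )
    else
        uname -sr
    fi
}

# Chip / SoC / board name
chip_info() {
    if [ "$(uname -s)" = "Darwin" ]; then
        sysctl -n machdep.cpu.brand_string
    elif [ -r /proc/device-tree/model ]; then
        # Raspberry Pi and many other ARM boards expose the board name here
        tr -d '\0' < /proc/device-tree/model; echo
    else
        uname -m
    fi
}

# Exact version line, or a note if the binary is missing
version_info() {
    if [ -x "$EULLM_BIN" ]; then
        "$EULLM_BIN" -V
    else
        echo "binary not found at $EULLM_BIN"
    fi
}

print_report() {
    echo "OS:      $(os_info)"
    echo "Arch:    $(uname -m)"
    echo "Chip:    $(chip_info)"
    echo "Version: $(version_info)"
}

print_report
```

Run it as, for example, `EULLM_BIN=./eullm-<your-binary> sh report.sh` (the file name is arbitrary) and paste the output into your comment.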

Priority order (most useful reports first)

  1. macOS Apple Silicon (Metal backend): biggest user base, Metal acceleration is the highest-impact thing to validate. M1/M2/M3/M4 all welcome.
  2. Linux ARM64 (Raspberry Pi 5 or 4 with ≥ 4 GB): strategic for the "EuLLM on €50 hardware" sovereign-AI narrative. Pi 5 preferred; Orange Pi 5+ / Rock 5B / Jetson Nano also great.
  3. macOS Intel (x86_64): smaller user base but still part of the catalog.
  4. TurboQuant variants (*-turboquant-exp): secondary; only after the standard variant on the same hardware is confirmed working.

Why we ship untested binaries

Building in CI for these platforms is essentially free and gives users on those systems the option to try. Marking them as "Experimental - untested" is intellectual honesty: the code compiles, but the maintainer can't claim it actually runs correctly on hardware they don't have. Your reports turn 🧪 into ✅.

What you get

The first 3 verified reports per platform (one report per chip/SoC family) get a permanent shout-out in the release notes and become part of the project's tested-platforms badge in the README.

Useful commands when reporting

# Version + variant (CPU / CUDA / Metal / ROCm / Vulkan + TurboQuant)
./eullm -V

# What's loaded
./eullm list

# A quick API round-trip (Ollama-compat)
curl -X POST http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{"model":"<your-model-name-from-list>","messages":[{"role":"user","content":"Hello!"}],"stream":false}'

# A quick API round-trip (OpenAI-compat)
curl -X POST http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"<your-model-name-from-list>","messages":[{"role":"user","content":"Hi!"}]}'

# The embedded chat UI (browser)
open http://localhost:11435/    # macOS
xdg-open http://localhost:11435/ # Linux
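If something breaks, the report asks for the last ~20 lines of engine output. One way to capture them is sketched below. `LOG_FILE` and `last_lines` are made-up names for illustration, and the sketch assumes the engine writes its messages to stdout/stderr of the `run` command.

```shell
#!/bin/sh
# Illustrative helper, not part of eullm.
# LOG_FILE is an assumed name for wherever you saved the engine output.
LOG_FILE="${LOG_FILE:-eullm-test.log}"

# Print the last N lines of a file (default: 20).
last_lines() {
    tail -n "${2:-20}" "$1"
}

# To record a session, run the engine through tee first (not executed here):
#   ./eullm-<your-binary> run /path/to/model.gguf --ctx-size 4096 2>&1 | tee "$LOG_FILE"

# If a log from a previous run exists, show its tail for pasting into the report.
if [ -f "$LOG_FILE" ]; then
    last_lines "$LOG_FILE"
fi
```

Redirecting with `2>&1` before `tee` keeps stderr and stdout in one file, so error text and metrics appear in the order they were printed.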

Thanks 🙏 Every report helps a real European SME or hobbyist run sovereign AI on hardware they already own.
