Comparing changes

base repository: mudler/LocalAI
base: v4.2.1
head repository: mudler/LocalAI
compare: v4.2.2
  • 14 commits
  • 30 files changed
  • 4 contributors

Commits on May 12, 2026

  1. chore(deps): bump node from 25-slim to 26-slim (#9769)

    Bumps node from 25-slim to 26-slim.
    
    ---
    updated-dependencies:
    - dependency-name: node
      dependency-version: 26-slim
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored May 12, 2026
    Commit cdf50fd
  2. chore(deps): bump actions/upload-artifact from 4 to 7 (#9770)

    Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 7.
    - [Release notes](https://github.com/actions/upload-artifact/releases)
    - [Commits](actions/upload-artifact@v4...v7)
    
    ---
    updated-dependencies:
    - dependency-name: actions/upload-artifact
      dependency-version: '7'
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored May 12, 2026
    Commit 9be5310
  3. chore(deps): bump actions/download-artifact from 4 to 8 (#9771)

    Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4 to 8.
    - [Release notes](https://github.com/actions/download-artifact/releases)
    - [Commits](actions/download-artifact@v4...v8)
    
    ---
    updated-dependencies:
    - dependency-name: actions/download-artifact
      dependency-version: '8'
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored May 12, 2026
    Commit d75173d
  4. chore(deps): bump github.com/anthropics/anthropic-sdk-go from 1.27.0 to 1.42.0 (#9772)
    
    chore(deps): bump github.com/anthropics/anthropic-sdk-go
    
    Bumps [github.com/anthropics/anthropic-sdk-go](https://github.com/anthropics/anthropic-sdk-go) from 1.27.0 to 1.42.0.
    - [Release notes](https://github.com/anthropics/anthropic-sdk-go/releases)
    - [Changelog](https://github.com/anthropics/anthropic-sdk-go/blob/main/CHANGELOG.md)
    - [Commits](anthropics/anthropic-sdk-go@v1.27.0...v1.42.0)
    
    ---
    updated-dependencies:
    - dependency-name: github.com/anthropics/anthropic-sdk-go
      dependency-version: 1.42.0
      dependency-type: direct:production
      update-type: version-update:semver-minor
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored May 12, 2026
    Commit 7aac599
  5. chore(deps): bump github.com/onsi/gomega from 1.39.1 to 1.40.0 (#9774)

    Bumps [github.com/onsi/gomega](https://github.com/onsi/gomega) from 1.39.1 to 1.40.0.
    - [Release notes](https://github.com/onsi/gomega/releases)
    - [Changelog](https://github.com/onsi/gomega/blob/master/CHANGELOG.md)
    - [Commits](onsi/gomega@v1.39.1...v1.40.0)
    
    ---
    updated-dependencies:
    - dependency-name: github.com/onsi/gomega
      dependency-version: 1.40.0
      dependency-type: direct:production
      update-type: version-update:semver-minor
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored May 12, 2026
    Commit cd7d163
  6. chore(deps): update transformers requirement from >=5.0.0 to >=5.8.0 in /backend/python/transformers (#9775)
    
    chore(deps): update transformers requirement
    
    Updates the requirements on [transformers](https://github.com/huggingface/transformers) to permit the latest version.
    - [Release notes](https://github.com/huggingface/transformers/releases)
    - [Commits](huggingface/transformers@v5.0.0...v5.8.0)
    
    ---
    updated-dependencies:
    - dependency-name: transformers
      dependency-version: 5.8.0
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored May 12, 2026
    Commit abc2a51
  7. chore(deps): bump github.com/fsnotify/fsnotify from 1.9.0 to 1.10.1 (#9778)
    
    Bumps [github.com/fsnotify/fsnotify](https://github.com/fsnotify/fsnotify) from 1.9.0 to 1.10.1.
    - [Release notes](https://github.com/fsnotify/fsnotify/releases)
    - [Changelog](https://github.com/fsnotify/fsnotify/blob/main/CHANGELOG.md)
    - [Commits](fsnotify/fsnotify@v1.9.0...v1.10.1)
    
    ---
    updated-dependencies:
    - dependency-name: github.com/fsnotify/fsnotify
      dependency-version: 1.10.1
      dependency-type: direct:production
      update-type: version-update:semver-minor
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored May 12, 2026
    Commit c660143
  8. chore(deps): update charset-normalizer requirement from >=3.4.0 to >=3.4.7 in /backend/python/vllm (#9779)
    
    chore(deps): update charset-normalizer requirement
    
    Updates the requirements on [charset-normalizer](https://github.com/jawah/charset_normalizer) to permit the latest version.
    - [Release notes](https://github.com/jawah/charset_normalizer/releases)
    - [Changelog](https://github.com/jawah/charset_normalizer/blob/master/CHANGELOG.md)
    - [Commits](jawah/charset_normalizer@3.4.0...3.4.7)
    
    ---
    updated-dependencies:
    - dependency-name: charset-normalizer
      dependency-version: 3.4.7
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored May 12, 2026
    Commit 61c9b18
  9. chore(deps): bump github.com/mudler/edgevpn from 0.31.1 to 0.32.2 (#9773)
    
    Bumps [github.com/mudler/edgevpn](https://github.com/mudler/edgevpn) from 0.31.1 to 0.32.2.
    - [Release notes](https://github.com/mudler/edgevpn/releases)
    - [Commits](mudler/edgevpn@v0.31.1...v0.32.2)
    
    ---
    updated-dependencies:
    - dependency-name: github.com/mudler/edgevpn
      dependency-version: 0.32.2
      dependency-type: direct:production
      update-type: version-update:semver-minor
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored May 12, 2026
    Commit 37991c8
  10. fix: parse vulkan VRAM from text (#9669)

    * fix: parse vulkan VRAM from text
    
    Assisted-by: opencode:gpt-5.5
    Signed-off-by: Andreas Egli <[email protected]>
    
    * fix: replace string.split with streaming iteration
    
    Assisted-by: Opencode:Gemma4
    Signed-off-by: Andreas Egli <[email protected]>
    
    ---------
    
    Signed-off-by: Andreas Egli <[email protected]>
    eglia authored May 12, 2026
    Commit 03815e3
  11. chore(deps): bump the npm_and_yarn group across 1 directory with 3 updates (#9728)
    
    Bumps the npm_and_yarn group with 3 updates in the /core/http/react-ui directory: [fast-uri](https://github.com/fastify/fast-uri), [hono](https://github.com/honojs/hono) and [ip-address](https://github.com/beaugunderson/ip-address).
    
    
    Updates `fast-uri` from 3.1.0 to 3.1.2
    - [Release notes](https://github.com/fastify/fast-uri/releases)
    - [Commits](fastify/fast-uri@v3.1.0...v3.1.2)
    
    Updates `hono` from 4.12.14 to 4.12.18
    - [Release notes](https://github.com/honojs/hono/releases)
    - [Commits](honojs/hono@v4.12.14...v4.12.18)
    
    Updates `ip-address` from 10.1.0 to 10.2.0
    - [Commits](https://github.com/beaugunderson/ip-address/commits)
    
    ---
    updated-dependencies:
    - dependency-name: fast-uri
      dependency-version: 3.1.2
      dependency-type: indirect
      dependency-group: npm_and_yarn
    - dependency-name: hono
      dependency-version: 4.12.18
      dependency-type: indirect
      dependency-group: npm_and_yarn
    - dependency-name: ip-address
      dependency-version: 10.2.0
      dependency-type: indirect
      dependency-group: npm_and_yarn
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored May 12, 2026
    Commit a689100
  12. fix(ollama): accept prompt alias on /api/embed for Ollama parity (#9780)
    
    Ollama's embedding endpoint accepts both `input` and `prompt` as the
    input string value (see ollama/ollama docs/api.md#generate-embeddings).
    LocalAI only accepted `input`, which broke client libraries that send
    the `prompt` form.
    
    Add `Prompt` to OllamaEmbedRequest and have GetInputStrings fall back
    to it when Input is unset. Input still wins when both are provided.
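
The precedence rule above can be sketched as follows. This is an illustration in Python, not LocalAI's Go code: the function name `get_input_strings`, the dict-shaped request, and the handling of list-valued and empty inputs are assumptions made for the example. Only the `input`-before-`prompt` ordering comes from the commit message.

```python
from typing import Any


def get_input_strings(request: dict[str, Any]) -> list[str]:
    """Return the strings to embed, preferring `input` over `prompt`.

    Hypothetical stand-in for the GetInputStrings fallback described above.
    An empty string or empty list is treated as "unset" (an assumption).
    """
    for key in ("input", "prompt"):  # `input` wins when both are provided
        value = request.get(key)
        if isinstance(value, str) and value:
            return [value]
        if isinstance(value, list) and value:
            return [str(item) for item in value]
    return []


# A client that only sends `prompt` now gets an embedding input as well.
print(get_input_strings({"model": "m", "prompt": "hello"}))
print(get_input_strings({"model": "m", "input": "a", "prompt": "b"}))
```

With both fields present the sketch returns the `input` value, matching the "Input still wins" note.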
    
    Fixes #9767.
    
    Assisted-by: Claude:claude-opus-4-7 [Claude Code]
    
    Signed-off-by: Ettore Di Giacinto <[email protected]>
    Co-authored-by: Ettore Di Giacinto <[email protected]>
    localai-bot and mudler authored May 12, 2026
    Commit a57e736
  13. ci: close GC race + cascade-skip + darwin grpc gaps from v4.2.1 (#9781)

    * ci: close the GC race + cascade-skip + darwin grpc gaps from v4.2.1
    
    v4.2.1's backend.yml run (#25701862853) exposed three independent issues
    on top of the singletons fix shipped in ea00199. Address all three plus
    two related cleanups:
    
    1. quay GC race in backend-merge-jobs-multiarch (12/37 merges failed with
       "manifest not found"). Even after PR #9746 split multi/single-arch
       merges, the multiarch matrix itself takes ~2h to drain at
       max-parallel: 8, and the earliest per-arch digests (push-by-digest,
       no tag) get reaped by quay's GC before the merge runs. The split
       bounded the race for multiarch; it doesn't eliminate it. Anchor each
       per-arch digest immediately to a tag in the internal ci-cache image
       (`keepalive-<run_id><tag-suffix>-<platform-tag>`). Quay won't GC
       tagged manifests. backend_merge.yml deletes the keepalive tags via
       quay REST API after publishing the user-facing manifest list.
       Cleanup is best-effort: if the quay token is not OAuth-scoped the
       merge does NOT fail, the orphan tags just persist.
    
    2. cascade-skip on backend-merge-jobs-singlearch. v4.2.1 had 2 failed
       and 2 cancelled singlearch builds (out of 199); GHA's default
       `needs:` semantics cascade-skipped the entire singlearch merge
       matrix, so zero singleton tags were applied even though 197
       singletons built successfully. Wrap the merge `if:` in
       `!cancelled() && ...` for both multi and single arch in backend.yml
       and backend_pr.yml so partial build failures publish the successful
       tag-suffixes.
    
    3. Darwin llama-cpp grpc-server build fails with `find_package(absl)`
       not found. Same shape as the ccache/blake3/fmt/hiredis/xxhash/zstd
       fix already in `Dependencies`: a brew cache hit restores
       `/opt/homebrew/Cellar/grpc` so `brew install grpc` no-ops, but
       abseil isn't in our Cellar cache list and never gets installed
       alongside, leaving grpc's CMake unable to resolve it. Mirror the
       `brew reinstall ccache` line with `brew reinstall grpc` to
       re-validate grpc's full transitive dep closure on every cache-hit
       run.
    
    4. Move the four heaviest CUDA cpp builds back to bigger-runner. v4.2.1
       wall-clock: -gpu-nvidia-cuda-12-llama-cpp 5h36m,
       -gpu-nvidia-cuda-12-turboquant 6h05m,
       -gpu-nvidia-cuda-13-llama-cpp 5h37m,
       -gpu-nvidia-cuda-13-turboquant 6h05m. The cuda-12 turboquant and
       cuda-13 turboquant entries are over GHA's 6h job timeout. Phase 5.3
       of the free-tier migration (PR #9730) had explicitly flagged this
       batch as 'highest-risk' with a per-entry revert path. All other
       matrix entries (vulkan-llama-cpp ~47m, ROCm hipblas-llama-cpp ~2h,
       intel sycl-f32 ~1h49m) stay on free-tier ubuntu-latest.
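
To make the keepalive tag format from item 1 concrete, here is a small Python sketch. The actual anchor and cleanup steps are shell in the workflows; the helper names, the example platform tag `amd64`, and the prefix-based cleanup filter are all assumptions for illustration. Only the `keepalive-<run_id><tag-suffix>-<platform-tag>` layout, the run id, and the tag suffix are taken from the message.

```python
def keepalive_tag(run_id: int, tag_suffix: str, platform_tag: str) -> str:
    """Build `keepalive-<run_id><tag-suffix>-<platform-tag>`.

    `tag_suffix` is expected to carry its own leading dash, as in
    `-gpu-nvidia-cuda-12-llama-cpp`.
    """
    return f"keepalive-{run_id}{tag_suffix}-{platform_tag}"


def is_keepalive_for_run(tag: str, run_id: int) -> bool:
    """Hypothetical filter a cleanup pass could use to pick one run's tags.

    The trailing dash keeps run 123 from matching tags of run 1234.
    """
    return tag.startswith(f"keepalive-{run_id}-")


tag = keepalive_tag(25701862853, "-gpu-nvidia-cuda-12-llama-cpp", "amd64")
print(tag)
print(is_keepalive_for_run(tag, 25701862853))
```

Because the run id is embedded in the tag, a cleanup step can target only the tags its own run created and leave concurrent runs alone.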
    
    Verified locally: all six edited workflow YAMLs parse cleanly. Real
    verification has to come from the next tag release run.
    
    Assisted-by: Claude:claude-opus-4-7
    Signed-off-by: Ettore Di Giacinto <[email protected]>
    
    * ci: extract keepalive anchor + cleanup into .github/scripts/
    
    The two inline shell blocks from the previous commit are long enough to
    hurt readability of the workflow YAML and benefit from their own files
    with self-contained docs. Move them to .github/scripts/:
    
      anchor-digest-in-cache.sh    backend_build.yml's keepalive anchor
      cleanup-keepalive-tags.sh    backend_merge.yml's best-effort cleanup
    
    Workflow steps reduce to a single `run:` invocation each, with all the
    parameter plumbing handled by env vars on the step. backend_merge.yml
    also gains a sparse `actions/checkout@v6` step (sparse to .github/scripts
    only) so the cleanup script is available on the runner — backend_build
    already checks out for the docker build.
    
    Net workflow diff: -36 lines across the two files. Script logic and
    behavior are byte-identical to the inline version.
    
    Assisted-by: Claude:claude-opus-4-7
    Signed-off-by: Ettore Di Giacinto <[email protected]>
    
    ---------
    
    Signed-off-by: Ettore Di Giacinto <[email protected]>
    Co-authored-by: Ettore Di Giacinto <[email protected]>
    localai-bot and mudler authored May 12, 2026
    Commit 86a7f6c
  14. feat(llama-cpp): bump to 1ec7ba0c, adapt grpc-server, expose new spec-decoding options (#9765)
    
    * chore(llama.cpp): bump to 1ec7ba0c14f33f17e980daeeda5f35b225d41994
    
    Picks up the upstream `spec : parallel drafting support` change
    (ggml-org/llama.cpp#22838) which reshapes the speculative-decoding API
    and `server_context_impl`.
    
    Adapt the grpc-server wrapper accordingly:
    
      * `common_params_speculative::type` (single enum) became `types`
        (`std::vector<common_speculative_type>`). Update both the
        "default to draft when a draft model is set" branch and the
        `spec_type`/`speculative_type` option parser. The parser now also
        tolerates comma-separated lists, mirroring the upstream
        `common_speculative_types_from_names` semantics.
      * `common_params_speculative_draft::n_ctx` is gone (draft now shares
        the target context size). Keep the `draft_ctx_size` option name for
        backward compatibility and ignore the value rather than failing.
      * `server_context_impl::model` was renamed to `model_tgt`; update the
        two reranker / model-metadata call sites.
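
The option handling described in the first bullet can be sketched like this. It is a Python illustration, not the C++ in grpc-server.cpp. The function names are hypothetical, the set of type names is limited to the ones this commit series mentions (upstream may accept others), and letting an explicit `spec_type` take priority over the draft-model default is an assumption.

```python
from typing import Optional

# Type names mentioned in this commit series; not an authoritative list.
KNOWN_SPEC_TYPES = {"draft", "ngram_mod", "ngram_map_k", "ngram_map_k4v", "ngram_cache"}


def parse_spec_types(value: str) -> list[str]:
    """Split a comma-separated `spec_type` value into a list of type names."""
    names = [part.strip() for part in value.split(",") if part.strip()]
    unknown = [name for name in names if name not in KNOWN_SPEC_TYPES]
    if unknown:
        raise ValueError(f"unknown speculative type(s): {unknown}")
    return names


def effective_spec_types(spec_type: Optional[str], has_draft_model: bool) -> list[str]:
    """Apply the "default to draft when a draft model is set" rule."""
    if spec_type:
        return parse_spec_types(spec_type)
    return ["draft"] if has_draft_model else []


print(effective_spec_types("draft,ngram_mod", has_draft_model=True))
print(effective_spec_types(None, has_draft_model=True))
```

A single name still parses to a one-element list, so configs written for the old scalar form keep working in this sketch.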
    
    Replaces #9763. Builds cleanly under the linux/amd64 cpu-llama-cpp
    target locally.
    
    Signed-off-by: Ettore Di Giacinto <[email protected]>
    
    * feat(llama-cpp): expose new speculative-decoding option keys
    
    Upstream `spec : parallel drafting support` (ggml-org/llama.cpp#22838)
    adds the `ngram_mod`, `ngram_map_k`, and `ngram_map_k4v` speculative
    families and beefs up the draft-model knobs. The previous bump only
    adapted the API; this exposes the new fields through the grpc-server
    options dictionary so model configs can drive them.
    
    New `options:` keys (all under `backend: llama-cpp`):
    
    ngram_mod (`ngram_mod` type):
      spec_ngram_mod_n_min / spec_ngram_mod_n_max / spec_ngram_mod_n_match
    
    ngram_map_k (`ngram_map_k` type):
      spec_ngram_map_k_size_n / spec_ngram_map_k_size_m / spec_ngram_map_k_min_hits
    
    ngram_map_k4v (`ngram_map_k4v` type):
      spec_ngram_map_k4v_size_n / spec_ngram_map_k4v_size_m /
      spec_ngram_map_k4v_min_hits
    
    ngram lookup caches (`ngram_cache` type):
      spec_lookup_cache_static / lookup_cache_static
      spec_lookup_cache_dynamic / lookup_cache_dynamic
    
    Draft-model tuning (active when `spec_type` is `draft`):
      draft_cache_type_k / spec_draft_cache_type_k
      draft_cache_type_v / spec_draft_cache_type_v
      draft_threads / spec_draft_threads
      draft_threads_batch / spec_draft_threads_batch
      draft_cpu_moe / spec_draft_cpu_moe          (bool flag)
      draft_n_cpu_moe / spec_draft_n_cpu_moe      (first N MoE layers on CPU)
      draft_override_tensor / spec_draft_override_tensor
        (comma-separated <tensor regex>=<buffer type>; re-implements upstream's
         static parse_tensor_buffer_overrides since it isn't exported)
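
As an illustration of the `draft_override_tensor` value format (comma-separated `<tensor regex>=<buffer type>`), the following Python sketch splits such a string into pairs. It is not the C++ re-implementation from this commit: the function name is hypothetical, splitting on the first `=` is an assumption, and the example patterns and the `CPU` buffer name are placeholders.

```python
def parse_tensor_overrides(value: str) -> list[tuple[str, str]]:
    """Parse `regex=buffer,regex=buffer` into (regex, buffer_type) pairs.

    Limitation of this sketch: a regex that itself contains a comma
    (for example a `{1,2}` quantifier) would be split incorrectly.
    """
    overrides = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty input and trailing commas
        pattern, sep, buffer_type = entry.partition("=")  # first '=' splits
        if not sep or not pattern or not buffer_type:
            raise ValueError(f"expected <tensor regex>=<buffer type>, got {entry!r}")
        overrides.append((pattern, buffer_type))
    return overrides


print(parse_tensor_overrides(r"blk\.0\.ffn_.*=CPU,output=CPU"))
```

Rejecting malformed entries up front gives a clearer error than passing a half-parsed override on to the backend.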
    
    `spec_type` already accepted comma-separated lists after the previous
    commit, matching upstream's `common_speculative_types_from_names`.
    
    Docs: refresh `docs/content/advanced/model-configuration.md` with
    per-family tables and a note about multi-type chaining.
    
    Builds locally with `make docker-build-llama-cpp` (linux/amd64
    cpu-llama-cpp AVX variant).
    
    Signed-off-by: Ettore Di Giacinto <[email protected]>
    
    * fix(turboquant): bridge new llama.cpp spec API to the legacy fork layout
    
    The previous commits in this series adapted backend/cpp/llama-cpp/grpc-server.cpp
    to the post-#22838 (parallel drafting) llama.cpp API. The turboquant build
    reuses the same grpc-server.cpp through backend/cpp/turboquant/Makefile,
    which copies it into turboquant-<flavor>-build/ and runs patch-grpc-server.sh
    on the copy. The fork branched before the API refactor, so it errors out on:
    
      * `ctx_server.impl->model_tgt` (fork still has `model`)
      * `params.speculative.{ngram_mod,ngram_map_k,ngram_map_k4v,ngram_cache}.*`
        (none of these sub-structs exist in the fork)
      * `params.speculative.draft.{cache_type_k/v, cpuparams[, _batch].n_threads,
        tensor_buft_overrides}` (fork uses the pre-#22397 flat layout)
      * `params.speculative.types` vector / `common_speculative_types_from_names`
        (fork has a scalar `type` and only the singular helper)
    
    Approach:
    
    1. backend/cpp/llama-cpp/grpc-server.cpp: introduce a single feature switch
       `LOCALAI_LEGACY_LLAMA_CPP_SPEC`. When defined, the two `speculative.type[s]`
       discriminations (the "default to draft when a draft model is set" branch
       and the `spec_type` / `speculative_type` option parser) fall back to the
       singular scalar form, and the entire new-option block (ngram_mod / map_k
       / map_k4v / ngram_cache / draft.{cache_type_*, cpuparams*,
       tensor_buft_overrides}) is preprocessed out. The macro is *not* defined
       in the source tree — stock llama-cpp builds get the full new API.
    
    2. backend/cpp/turboquant/patch-grpc-server.sh: two new patch steps applied
       to the per-flavor build copy at turboquant-<flavor>-build/grpc-server.cpp:
       - substitute `ctx_server.impl->model_tgt` -> `ctx_server.impl->model`
       - inject `#define LOCALAI_LEGACY_LLAMA_CPP_SPEC 1` before the first
         `#include`, so the guarded blocks above drop out for the fork build.
    
       Both patches are idempotent and follow the existing sed/awk pattern in
       this script (KV cache types, `get_media_marker`, flat speculative
       renames). Stock llama-cpp's `grpc-server.cpp` is never touched.
    
    Drop both legacy patches once the turboquant fork rebases past
    ggml-org/llama.cpp#22397 / #22838.
    
    Signed-off-by: Ettore Di Giacinto <[email protected]>
    
    * fix(turboquant): close draft_ctx_size brace inside legacy guard
    
    The previous turboquant fix wrapped the new option-handler blocks in
    `#ifndef LOCALAI_LEGACY_LLAMA_CPP_SPEC ... #endif` but placed the guard
    in the middle of an `else if` chain — the `} else if` openings of the
    new blocks were responsible for closing the previous block's brace.
    With the macro defined the new blocks vanish, draft_ctx_size's `{`
    loses its closer, the for-loop's `}` is consumed instead, and the
    file ends with a stray opening brace — clang reports it as
    `function-definition is not allowed here before '{'` on the next
    top-level `int main(...)` and `expected '}' at end of input`.
    
    Move the chain split inside the draft_ctx_size branch:
    
        } else if (... "draft_ctx_size") {
            // ...
    #ifdef LOCALAI_LEGACY_LLAMA_CPP_SPEC
        }                                  // legacy: chain ends here
    #else
        } else if (... "spec_ngram_mod_n_min") {  // modern: chain continues
            ...
        } else if (... "draft_override_tensor") {
            ...
        }                                  // closes last branch
    #endif
        }                                  // closes for-loop
    
    Brace count is now balanced under both preprocessor branches (verified
    with `tr -cd '{' | wc -c` against the patched and unpatched outputs).
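
The brace check can be reproduced on a toy version of the chain. The Python sketch below is an illustration under assumptions: the tiny `preprocess` helper only understands `#ifdef`, `#ifndef`, `#else` and `#endif`, and the `BROKEN` snippet is a reconstruction of the earlier guard placement from the description above, not the real source.

```python
MACRO = "LOCALAI_LEGACY_LLAMA_CPP_SPEC"

# Reconstructed earlier layout: the chain's final closer sits inside the guard.
BROKEN = """\
for (const auto & opt : options) {
    if (name == "a") {
    } else if (name == "draft_ctx_size") {
#ifndef LOCALAI_LEGACY_LLAMA_CPP_SPEC
    } else if (name == "spec_ngram_mod_n_min") {
    } else if (name == "draft_override_tensor") {
    }
#endif
}
"""

# Fixed layout: each preprocessor branch closes the chain itself.
FIXED = """\
for (const auto & opt : options) {
    if (name == "a") {
    } else if (name == "draft_ctx_size") {
#ifdef LOCALAI_LEGACY_LLAMA_CPP_SPEC
    }
#else
    } else if (name == "spec_ngram_mod_n_min") {
    } else if (name == "draft_override_tensor") {
    }
#endif
}
"""


def preprocess(source: str, defined: set[str]) -> str:
    """Keep only the lines that survive the conditional directives."""
    out, stack = [], []  # stack: is each open conditional's branch active?
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith(("#ifdef ", "#ifndef ")):
            directive, name = stripped.split()[:2]
            stack.append((name in defined) == (directive == "#ifdef"))
        elif stripped == "#else":
            stack[-1] = not stack[-1]
        elif stripped == "#endif":
            stack.pop()
        elif all(stack):
            out.append(line)
    return "\n".join(out)


def brace_balance(text: str) -> int:
    """Opening braces minus closing braces; 0 means balanced."""
    return text.count("{") - text.count("}")


for label, src in (("broken", BROKEN), ("fixed", FIXED)):
    for defined in (set(), {MACRO}):
        print(label, sorted(defined), brace_balance(preprocess(src, defined)))
```

In this toy model only the broken layout with the macro defined comes out unbalanced, by one unclosed `{`, which is the same shape as the clang errors quoted above.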
    
    Local `make docker-build-turboquant` builds the linux/amd64 cpu-llama-cpp
    `turboquant-avx` variant cleanly.
    
    Signed-off-by: Ettore Di Giacinto <[email protected]>
    
    * fix(ci): forward AMDGPU_TARGETS into Dockerfile.turboquant builder-prebuilt
    
    Dockerfile.turboquant's `builder-prebuilt` stage was missing the
    `ARG AMDGPU_TARGETS` / `ENV AMDGPU_TARGETS=${AMDGPU_TARGETS}` pair that
    `builder-fromsource` already has (and that `Dockerfile.llama-cpp`
    mirrors across both stages). When CI uses the prebuilt base image
    (quay.io/go-skynet/ci-cache:base-grpc-*, the common path) the build-arg
    passed by the workflow never reaches the env inside the compile stage.
    
    backend/cpp/llama-cpp/Makefile:38 (introduced by #9626) errors out on
    hipblas builds when AMDGPU_TARGETS is empty, and the turboquant
    Makefile reuses backend/cpp/llama-cpp via a sibling build dir, so the
    same check fires from turboquant-fallback under BUILD_TYPE=hipblas:
    
      Makefile:38: *** AMDGPU_TARGETS is empty — set it to a comma-separated
      list of gfx targets e.g. gfx1100,gfx1101.  Stop.
      make: *** [Makefile:66: turboquant-fallback] Error 2
    
    The bug is latent on master because the docker layer cache stays warm
    across builds — the compile step rarely re-runs from scratch. The
    llama.cpp bump in this PR invalidates the cache, so the missing env var
    becomes load-bearing and the hipblas turboquant CI job fails.
    
    Mirror the existing pattern from Dockerfile.llama-cpp.
    
    Signed-off-by: Ettore Di Giacinto <[email protected]>
    
    ---------
    
    Signed-off-by: Ettore Di Giacinto <[email protected]>
    Co-authored-by: Ettore Di Giacinto <[email protected]>
    localai-bot and mudler authored May 12, 2026
    Commit bc4cd3d