Misc. bug: OOM, the process does not exit #14458

Open
@sunnuyday111

Description

Name and Version

[2025-06-30 17:03:22.202468] I version : v0.0.144 (defe859)
[2025-06-30 17:03:22.202468] I compiler : cc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2)
[2025-06-30 17:03:22.202468] I target : x86_64-redhat-linux

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

No response

Command line

/usr/local/lib/python3.10/dist-packages/istoreai/third_party/bin/llama-box/llama-box --host 0.0.0.0 --embeddings --gpu-layers 2 --parallel 4 --ctx-size 8192 --port 40033 --model /data/depot/model_scope/Qwen/Qwen3-Embedding-8B-GGUF/Qwen3-Embedding-8B-Q4_K_M.gguf --alias qwen3-embedding-8b-gguf --no-mmap --no-warmup

Problem description & steps to reproduce

Starting llama-box with the command above triggers a CUDA out-of-memory error while the model is loading, but the process does not exit after the error is printed.

root@178b95ed455e:/# /usr/local/lib/python3.10/dist-packages/istoreai/third_party/bin/llama-box/llama-box --host 0.0.0.0 --embeddings --gpu-layers 2 --parallel 4 --ctx-size 8192 --port 40033 --model /data/depot/model_scope/Qwen/Qwen3-Embedding-8B-GGUF/Qwen3-Embedding-8B-Q4_K_M.gguf --alias qwen3-embedding-8b-gguf --no-mmap --no-warmup
[2025-06-30 17:03:22.202468] I
[2025-06-30 17:03:22.202468] I arguments : /usr/local/lib/python3.10/dist-packages/istoreai/third_party/bin/llama-box/llama-box --host 0.0.0.0 --embeddings --gpu-layers 2 --parallel 4 --ctx-size 8192 --port 40033 --model /data/depot/model_scope/Qwen/Qwen3-Embedding-8B-GGUF/Qwen3-Embedding-8B-Q4_K_M.gguf --alias qwen3-embedding-8b-gguf --no-mmap --no-warmup
[2025-06-30 17:03:22.202468] I version : v0.0.144 (defe859)
[2025-06-30 17:03:22.202468] I compiler : cc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2)
[2025-06-30 17:03:22.202468] I target : x86_64-redhat-linux
[2025-06-30 17:03:22.202468] I vendor : llama.cpp bc098c3 (5401), stable-diffusion.cpp 3eb18db (204), concurrentqueue 2f09da7 (295), readerwriterqueue 16b48ae (166)
[2025-06-30 17:03:22.202589] I ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
[2025-06-30 17:03:22.202589] I ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
[2025-06-30 17:03:22.202589] I ggml_cuda_init: found 8 CUDA devices:
[2025-06-30 17:03:22.202591] I Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
[2025-06-30 17:03:22.202592] I Device 1: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
[2025-06-30 17:03:22.202593] I Device 2: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
[2025-06-30 17:03:22.202595] I Device 3: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
[2025-06-30 17:03:22.202597] I Device 4: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
[2025-06-30 17:03:22.202599] I Device 5: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
[2025-06-30 17:03:22.202602] I Device 6: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
[2025-06-30 17:03:22.202604] I Device 7: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes

/home/runner/work/llama-box/llama-box/llama.cpp/ggml/src/ggml-cuda/ggml-cuda.cu:75: [2025-06-30 17:03:23.203420] E CUDA error: out of memory
[2025-06-30 17:03:23.203420] E current device: 4, in function ggml_cuda_set_device at /home/runner/work/llama-box/llama-box/llama.cpp/ggml/src/ggml-cuda/ggml-cuda.cu:88
[2025-06-30 17:03:23.203420] E cudaSetDevice(device)
CUDA error
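
For context, the three E lines above match the shape of the CUDA error handler in llama.cpp's ggml/src/ggml-cuda/ggml-cuda.cu, which is expected to abort the process. A minimal sketch of that path (paraphrased from memory, not the exact upstream source; names and details are approximate):

    // Hedged sketch of llama.cpp's CUDA error path in ggml-cuda.cu;
    // paraphrased, not the exact upstream source.
    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    [[noreturn]] static void ggml_cuda_error(const char * stmt, const char * func,
                                             const char * file, int line, const char * msg) {
        int id = -1;
        (void) cudaGetDevice(&id);  // best-effort: report which device was current
        std::fprintf(stderr, "CUDA error: %s\n", msg);
        std::fprintf(stderr, "  current device: %d, in function %s at %s:%d\n",
                     id, func, file, line);
        std::fprintf(stderr, "  %s\n", stmt);  // the failing call, e.g. "cudaSetDevice(device)"
        // Upstream this ends in GGML_ABORT("CUDA error"), which prints the final
        // "CUDA error" line seen above and then calls abort().
        std::abort();
    }

Since abort() raises SIGABRT and should take the whole process down, a llama-box that keeps running after printing "CUDA error" suggests the abort is being intercepted or another thread is keeping the process alive; attaching gdb -p <pid> to the surviving process and running thread apply all bt should show where it is stuck.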

First Bad Commit

No response

Relevant log output

See the console output in the problem description above.
