
Llava clip not loading to GPU in version 0.2.58 (Downgrading to 0.2.55 works) #1324


Closed
4 tasks done
FYYHU opened this issue Apr 3, 2024 · 2 comments
Labels
bug (Something isn't working)

Comments


FYYHU commented Apr 3, 2024

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

I wanted to use Llava as described in the README: I used the provided code and the linked GGUF files, and I installed the module with the cuBLAS flags mentioned in the documentation.

I expected both the CLIP vision tower and the LLM to be loaded on CUDA.
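For context, a minimal sketch of that setup, following the Llava example in the llama-cpp-python README; the GGUF file names below are placeholders rather than the exact files from this report:

# Installed with CUDA support via the documented flag, e.g.:
#   CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# Placeholder paths for the CLIP projector and the Llava model GGUFs.
chat_handler = Llava15ChatHandler(clip_model_path="./mmproj-model-f16.gguf", verbose=True)
llm = Llama(
    model_path="./llava-v1.5-7b.Q4_K_M.gguf",
    chat_handler=chat_handler,
    n_ctx=2048,        # larger context to fit the image embedding
    n_gpu_layers=-1,   # offload all LLM layers to the GPU
    logits_all=True,   # required by the Llava chat handler
)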

Current Behavior

On the latest version (0.2.58) of llama-cpp-python, I observe that the CLIP model is forced onto the CPU backend, while the LLM part uses CUDA. Downgrading llama-cpp-python to version 0.2.55 fixes this issue.

Environment and Context

OS: Ubuntu 22.04 - X86
CUDA: 11.8
Python: 3.8 (in miniconda)
llama-cpp-python: 0.2.58


eisneim commented Apr 4, 2024

same here!
OS: Ubuntu 22.04 - X86
CUDA: 12
Python: 3.10.14
llama-cpp-python: 0.2.59

I'm using: CMAKE_ARGS="-DLLAMA_CUBLAS=on -DLLAVA_BUILD=on" pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python

but I still get:

clip_model_load: - type  f32:  235 tensors
clip_model_load: - type  f16:  142 tensors
clip_model_load: CLIP using CPU backend
clip_model_load: params backend buffer size =  615.49 MB (377 tensors)
clip_model_load: compute allocated memory: 32.89 MB

abetlen added the bug label on Apr 4, 2024

abetlen (Owner) commented May 10, 2024

@FYYHU @eisneim thanks for reporting this. The flag that enables CUDA support changed from GGML_USE_CUBLAS to the more appropriate GGML_USE_CUDA, but the CMakeLists.txt in this project was still using the old value, so the define was never set and CUDA support wasn't compiled in. It's fixed now and should be in the next release.
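Once a release with that change is out, one way to confirm the projector is back on the GPU is to construct the chat handler with verbose output and check the clip_model_load lines, which should report a CUDA backend instead of the "CLIP using CPU backend" line quoted above (a small sketch, reusing the placeholder path from earlier):

from llama_cpp.llama_chat_format import Llava15ChatHandler

# With verbose=True the clip_model_load messages are printed to stderr,
# including which backend (CPU or CUDA) the CLIP weights were loaded on.
chat_handler = Llava15ChatHandler(clip_model_path="./mmproj-model-f16.gguf", verbose=True)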
