Llava clip not loading to GPU in version 0.2.58 (Downgrading to 0.2.55 works) #1324
Expected Behavior
I wanted to implement Llava as described in the README, using the provided code and the linked GGUF files. I also installed the module with the CUBLAS flags mentioned in the documentation.
I expected both the CLIP vision tower and the LLM to be loaded on CUDA.
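For context, here is a minimal sketch of the README-style setup involved, assuming the `Llava15ChatHandler` API from `llama_cpp.llama_chat_format`; the model paths and the install command in the comments are placeholders, not the exact files used:

```python
# Minimal Llava setup following the README pattern (paths are placeholders).
# The module was installed with GPU support, roughly:
#   CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# The chat handler loads the CLIP vision tower from the mmproj GGUF file.
chat_handler = Llava15ChatHandler(clip_model_path="./mmproj-model-f16.gguf")

llm = Llama(
    model_path="./llava-v1.5-7b.Q4_K_M.gguf",
    chat_handler=chat_handler,
    n_ctx=2048,       # larger context to accommodate the image embedding
    n_gpu_layers=-1,  # offload all LLM layers to CUDA
    logits_all=True,  # required for llava
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an assistant who describes images."},
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/image.png"}},
                {"type": "text", "text": "Describe this image in detail."},
            ],
        },
    ]
)
print(response["choices"][0]["message"]["content"])
```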
Current Behavior
With the latest version (0.2.58) of llama-cpp-python, the CLIP model is forced onto the CPU backend while the LLM part uses CUDA. Downgrading llama-cpp-python to version 0.2.55 fixes the issue.
Environment and Context
OS: Ubuntu 22.04 - X86
CUDA: 11.8
Python: 3.8 (in miniconda)
llama-cpp-python: 0.2.58
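For anyone reproducing this, a quick way to confirm which build is active after downgrading; the pinned-version install command in the comment is an assumption about how the downgrade was done:

```python
# Verify the installed llama-cpp-python version after downgrading,
# e.g. after: CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python==0.2.55
import llama_cpp

print(llama_cpp.__version__)  # expected: 0.2.55 after the downgrade
```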