
jflachman/llama-cpp-python

By jflachman

Updated over 1 year ago

CUDA & CPU versions of llama-cpp-python with server configuration and multiple model support.


jflachman/llama-cpp-python repository overview

This is a build of llama-cpp-python. The tag indicates the build and the supported processor. For example:

  • llama-cpp-python:v0.2.77-cuda is llama-cpp-python version 0.2.77 built with CUDA support (~5 GB image)
  • llama-cpp-python:v0.2.77-cpu is llama-cpp-python version 0.2.77 built with CPU-only support (~1.8 GB image)

The CUDA version can also run on CPUs, but the image is larger, so select the CPU version if you don't have an Nvidia graphics card / GPU. The default port is 11434. You can change it by adding -e PORT=8000 to the docker run command, or by setting the desired port in the server.config file (second example).
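For example, a CPU-only container with the port overridden to 8000 could be started as follows (the host model directory and model filename here are illustrative placeholders, not values from this repository):

```shell
# Run the CPU-only image, overriding the default port (11434) via PORT.
# /path/to/models and llama-2-7b.Q4_K_M.gguf are placeholder values.
docker run -it -d -p 8000:8000 \
  -e PORT=8000 \
  -e MODEL=/var/model/llama-2-7b.Q4_K_M.gguf \
  -v /path/to/models:/var/model \
  jflachman/llama-cpp-python:v0.2.77-cpu
```

Note that the host-side -p mapping must match the port the server listens on, so both the -p argument and PORT change together.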

Running the CUDA image:

docker run -it -d -p 11434:11434 --gpus=all --cap-add SYS_RESOURCE -e USE_MLOCK=0 -e MODEL=/var/model/<model name> -v <local directory on host>:/var/model <image name>

To provide a server config file use the CONFIG_FILE environment variable.
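A sketch of the same CUDA run, but pointing the server at a config file inside the mounted directory instead of a single MODEL (the host path is an illustrative placeholder):

```shell
# Run the CUDA image with a server config file instead of a single model.
# CONFIG_FILE must be the path as seen inside the container, so the file
# should live in the directory mounted at /var/model.
# /path/to/models is a placeholder value.
docker run -it -d -p 11434:11434 --gpus=all --cap-add SYS_RESOURCE \
  -e USE_MLOCK=0 \
  -e CONFIG_FILE=/var/model/server.config \
  -v /path/to/models:/var/model \
  jflachman/llama-cpp-python:v0.2.77-cuda
```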

Server Config

The server can be configured with a server.config file. The file name is arbitrary. Put this file in the folder that you mount into the container.
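A minimal sketch of creating such a file is below. The JSON field names follow the llama-cpp-python server's config-file settings (host, port, and a models list); the specific model path, alias, and parameter values are illustrative assumptions, not values from this repository:

```shell
# Write a minimal server.config into the directory that will be mounted
# at /var/model in the container. The model path is as the server will
# see it inside the container; the .gguf filename is a placeholder.
mkdir -p ./models
cat > ./models/server.config <<'EOF'
{
  "host": "0.0.0.0",
  "port": 11434,
  "models": [
    {
      "model": "/var/model/llama-2-7b.Q4_K_M.gguf",
      "model_alias": "llama-2-7b",
      "n_gpu_layers": -1,
      "n_ctx": 4096
    }
  ]
}
EOF
```

Listing several entries under "models" is how the image supports multiple models; clients select one by its model_alias.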

For more information on the parameters for configuring the server, see the llama-cpp-python server documentation.

Tag summary

  • Content type: Image
  • Digest: sha256:07623ccbf
  • Size: 1.6 GB
  • Last updated: over 1 year ago
