Documentation is available at [https://llama-cpp-python.readthedocs.io/en/latest](https://llama-cpp-python.readthedocs.io/en/latest).
## Installation
`llama-cpp-python` can be installed directly from PyPI as a source distribution by running:
```bash
pip install llama-cpp-python
```
This will build `llama.cpp` from source using cmake and your system's C compiler (required) and install the library alongside this Python package.

If you run into issues during installation, add the `--verbose` flag to the `pip install` command to see the full cmake build log.
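
For instance, combining the command and the flag:

```bash
# Reinstall with verbose output so cmake errors are visible in the log
pip install llama-cpp-python --verbose
```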
### Installation with Specific Hardware Acceleration (BLAS, CUDA, Metal, etc.)
The default pip install behaviour is to build `llama.cpp` for CPU only on Linux and Windows and use Metal on MacOS.

`llama.cpp` supports a number of hardware acceleration backends including OpenBLAS, cuBLAS, CLBlast, HIPBLAS, and Metal.
See the [llama.cpp README](https://github.com/ggerganov/llama.cpp#build) for a full list of supported backends.

All of these backends are supported by `llama-cpp-python` and can be enabled by setting the `CMAKE_ARGS` environment variable before installing.

On Linux and Mac you set `CMAKE_ARGS` inline with the `pip install` command.
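
For example, a minimal sketch that enables the cuBLAS (NVIDIA GPU) backend; the `-DLLAMA_CUBLAS=on` flag name is an assumption based on `llama.cpp`'s cmake options, so check the llama.cpp build docs for the flags your version supports:

```bash
# Build llama.cpp with the cuBLAS backend; the flag name is assumed from llama.cpp's cmake options
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
```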
On Windows, if the build complains that it can't find `'nmake'` or `CMAKE_C_COMPILER`, you can extract w64devkit as [mentioned in the llama.cpp repo](https://github.com/ggerganov/llama.cpp#openblas) and point `CMAKE_ARGS` at its compilers manually before running `pip install`.
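
A sketch of what that can look like from a bash-style shell (w64devkit ships one); the `C:/w64devkit/...` paths are assumptions and should point at wherever you extracted the kit:

```bash
# Tell cmake explicitly which compilers to use; the paths below are illustrative
export CMAKE_ARGS="-DCMAKE_C_COMPILER=C:/w64devkit/bin/gcc.exe -DCMAKE_CXX_COMPILER=C:/w64devkit/bin/g++.exe"
pip install llama-cpp-python
```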
On Apple Silicon (M1) Macs, make sure you are running a native arm64 build of Python; otherwise the install will build the x86 version of `llama.cpp`, which will be 10x slower.
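
A quick way to check which architecture your interpreter targets, using only the standard library:

```bash
# Prints 'arm64' for a native Apple Silicon Python, 'x86_64' for one running under Rosetta
python3 -c "import platform; print(platform.machine())"
```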
Detailed MacOS Metal GPU install documentation is available at [docs/install/macos.md](https://llama-cpp-python.readthedocs.io/en/latest/install/macos/)
### Upgrading and Reinstalling
To upgrade or rebuild `llama-cpp-python`, add the following flags to ensure that the package is rebuilt correctly.
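
A sketch of the usual combination; `--force-reinstall` and `--no-cache-dir` are standard pip flags that keep pip from reusing a previously built wheel:

```bash
# Upgrade and force a clean rebuild, bypassing pip's cache
pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
```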
### Are there pre-built binaries / binary wheels available?

`llama.cpp` is built with compiler optimizations that are specific to your system. Using pre-built binaries would require disabling these optimizations or supporting a large number of pre-built binaries for each platform.
That being said, there are some pre-built binaries available through the Releases as well as some community-provided wheels.

In the future, I would like to provide pre-built binaries and wheels for common platforms and I'm happy to accept any useful contributions in this area.
This is currently being tracked in [#741](https://github.com/abetlen/llama-cpp-python/issues/741)
### How does this compare to other Python bindings of `llama.cpp`?