Commit 32efed7

docs: Update README

1 parent d80c5cf commit 32efed7

1 file changed: README.md (+91 -33 lines)
@@ -25,105 +25,162 @@ Documentation is available at [https://llama-cpp-python.readthedocs.io/en/latest
## Installation

Requirements:

- Python 3.8+
- C compiler
  - Linux: gcc or clang
  - Windows: Visual Studio or MinGW
  - MacOS: Xcode

To install the package, run:

```bash
pip install llama-cpp-python
```

This will also build `llama.cpp` from source and install it alongside this python package.

If this fails, add `--verbose` to the `pip install` command to see the full cmake build log.
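
For example, a quick way to get the full build output is simply the documented `--verbose` flag added to the install command above:

```bash
# Prints the full cmake configure and build log during installation
pip install llama-cpp-python --verbose
```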

### Installation Configuration

`llama.cpp` supports a number of hardware acceleration backends to speed up inference as well as backend specific options. See the [llama.cpp README](https://github.com/ggerganov/llama.cpp#build) for a full list.

All `llama.cpp` cmake build options can be set via the `CMAKE_ARGS` environment variable or via the `--config-settings / -C` cli flag during installation.

<details>
<summary>Environment Variables</summary>

```bash
# Linux and Mac
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" \
  pip install llama-cpp-python
```

```powershell
# Windows
$env:CMAKE_ARGS = "-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
pip install llama-cpp-python
```
</details>

<details>
<summary>CLI / requirements.txt</summary>

They can also be set via the `pip install -C / --config-settings` command and saved to a `requirements.txt` file:

```bash
pip install --upgrade pip # ensure pip is up to date
pip install llama-cpp-python \
  -C cmake.args="-DLLAMA_BLAS=ON;-DLLAMA_BLAS_VENDOR=OpenBLAS"
```

```txt
# requirements.txt

llama-cpp-python -C cmake.args="-DLLAMA_BLAS=ON;-DLLAMA_BLAS_VENDOR=OpenBLAS"
```

</details>

### Supported Backends

Below are some common backends, their build commands and any additional environment variables required.

<details>
<summary>OpenBLAS (CPU)</summary>

To install with OpenBLAS, set the `LLAMA_BLAS` and `LLAMA_BLAS_VENDOR` environment variables before installing:

```bash
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python
```
</details>

<details>
<summary>cuBLAS (CUDA)</summary>

To install with cuBLAS, set the `LLAMA_CUBLAS=on` environment variable before installing:

```bash
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
```
</details>

<details>
<summary>Metal</summary>

To install with Metal (MPS), set the `LLAMA_METAL=on` environment variable before installing:

```bash
CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python
```
</details>

<details>
<summary>CLBlast (OpenCL)</summary>

To install with CLBlast, set the `LLAMA_CLBLAST=on` environment variable before installing:

```bash
CMAKE_ARGS="-DLLAMA_CLBLAST=on" pip install llama-cpp-python
```
</details>

<details>
<summary>hipBLAS (ROCm)</summary>

To install with hipBLAS / ROCm support for AMD cards, set the `LLAMA_HIPBLAS=on` environment variable before installing:

```bash
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python
```
</details>

<details>
<summary>Vulkan</summary>

To install with Vulkan support, set the `LLAMA_VULKAN=on` environment variable before installing:

```bash
CMAKE_ARGS="-DLLAMA_VULKAN=on" pip install llama-cpp-python
```
</details>

<details>
<summary>Kompute</summary>

To install with Kompute support, set the `LLAMA_KOMPUTE=on` environment variable before installing:

```bash
CMAKE_ARGS="-DLLAMA_KOMPUTE=on" pip install llama-cpp-python
```
</details>

<details>
<summary>SYCL</summary>

To install with SYCL support, set the `LLAMA_SYCL=on` environment variable before installing:

```bash
source /opt/intel/oneapi/setvars.sh
CMAKE_ARGS="-DLLAMA_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx" pip install llama-cpp-python
```
</details>

### Windows Notes

<details>
<summary>Error: Can't find 'nmake' or 'CMAKE_C_COMPILER'</summary>

If you run into issues where the build complains it can't find `'nmake'` or `CMAKE_C_COMPILER`, you can extract w64devkit as [mentioned in the llama.cpp repo](https://github.com/ggerganov/llama.cpp#openblas) and add those manually to `CMAKE_ARGS` before running `pip install`:

```ps
$env:CMAKE_ARGS = "-DLLAMA_OPENBLAS=on -DCMAKE_C_COMPILER=C:/w64devkit/bin/gcc.e
```

See the above instructions and set `CMAKE_ARGS` to the BLAS backend you want to use.
</details>

### MacOS Notes

Detailed MacOS Metal GPU install documentation is available at [docs/install/macos.md](https://llama-cpp-python.readthedocs.io/en/latest/install/macos/)

<details>
<summary>M1 Mac Performance Issue</summary>

Note: If you are using an Apple Silicon (M1) Mac, make sure you have installed a version of Python that supports the arm64 architecture. For example:

```bash
bash Miniforge3-MacOSX-arm64.sh
```

Otherwise, while installing, it will build the llama.cpp x86 version, which will be 10x slower on an Apple Silicon (M1) Mac.
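
One way to confirm which architecture your current Python build targets (a generic check, not from the original docs):

```bash
# Should print "arm64" on a native Apple Silicon Python; "x86_64" means an x86/Rosetta build
python3 -c "import platform; print(platform.machine())"
```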
</details>

<details>
<summary>M Series Mac Error: `(mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))`</summary>

Try installing with:

```bash
CMAKE_ARGS="-DCMAKE_OSX_ARCHITECTURES=arm64 -DCMAKE_APPLE_SILICON_PROCESSOR=arm64 -DLLAMA_METAL=on" pip install --upgrade --verbose --force-reinstall --no-cache-dir llama-cpp-python
```
</details>

### Upgrading and Reinstalling

To upgrade and rebuild `llama-cpp-python`, add the `--upgrade --force-reinstall --no-cache-dir` flags to the `pip install` command to ensure the package is rebuilt from source.
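
For example, a full rebuild using the OpenBLAS flags shown earlier (substitute whatever `CMAKE_ARGS` you need):

```bash
# Rebuild llama.cpp from source with the currently set CMAKE_ARGS
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" \
  pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
```

This ensures that all source files are re-built with the most recently set `CMAKE_ARGS` flags.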

## High-level API

@@ -218,13 +274,15 @@ You can pull `Llama` models from Hugging Face using the `from_pretrained` method
You'll need to install the `huggingface-hub` package to use this feature (`pip install huggingface-hub`).

```python
llm = Llama.from_pretrained(
    repo_id="Qwen/Qwen1.5-0.5B-Chat-GGUF",
    filename="*q8_0.gguf",
    verbose=False
)
```

By default, `from_pretrained` will download the model to the Hugging Face cache directory, so you can manage installed model files with the `huggingface-cli` tool.
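
For example, assuming `huggingface-hub` is installed, you can inspect the cache with its CLI (a sketch; see the `huggingface-cli` documentation for details):

```bash
# List the models currently stored in the Hugging Face cache
huggingface-cli scan-cache
```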
### Chat Completion

The high-level API also provides a simple interface for chat completion.
