

Commit 2cc6c9a
docs: Update README, add FAQ
Parent: 7f3704b

File tree: README.md (1 file changed, +30 -7 lines)

README.md

Lines changed: 30 additions & 7 deletions
@@ -1,4 +1,5 @@
 # 🦙 Python Bindings for [`llama.cpp`](https://github.com/ggerganov/llama.cpp)
+---
 
 [![Documentation Status](https://readthedocs.org/projects/llama-cpp-python/badge/?version=latest)](https://llama-cpp-python.readthedocs.io/en/latest/?badge=latest)
 [![Tests](https://github.com/abetlen/llama-cpp-python/actions/workflows/test.yaml/badge.svg?branch=main)](https://github.com/abetlen/llama-cpp-python/actions/workflows/test.yaml)
@@ -23,7 +24,8 @@ Documentation is available at [https://llama-cpp-python.readthedocs.io/en/latest
 
 
 
-## Installation from PyPI
+## Installation
+---
 
 Install from PyPI (requires a c compiler):
 
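Note: as a quick sanity check after installing from source as described in this hunk, importing the package and printing its version confirms the extension built correctly. This is a minimal sketch; it assumes the installed release exposes `llama_cpp.__version__`.

```python
# Minimal post-install sanity check (assumes llama_cpp.__version__ exists).
import llama_cpp

print("llama-cpp-python version:", llama_cpp.__version__)
```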
@@ -107,6 +109,7 @@ See the above instructions and set `CMAKE_ARGS` to the BLAS backend you want to
 Detailed MacOS Metal GPU install documentation is available at [docs/install/macos.md](https://llama-cpp-python.readthedocs.io/en/latest/install/macos/)
 
 ## High-level API
+---
 
 [API Reference](https://llama-cpp-python.readthedocs.io/en/latest/api-reference/#high-level-api)
 
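Note: for readers skimming the diff, the high-level API that this section links to is used roughly as sketched below. The model path is a placeholder and the call pattern follows the README's own completion example; treat it as a sketch, not a definitive reference.

```python
# Text-completion sketch with the high-level API (model path is a placeholder).
from llama_cpp import Llama

llm = Llama(model_path="./models/7B/llama-model.gguf", n_ctx=2048)
output = llm(
    "Q: Name the planets in the solar system? A: ",  # prompt
    max_tokens=64,       # cap the length of the completion
    stop=["Q:", "\n"],   # stop at the next question or newline
    echo=True,           # include the prompt in the returned text
)
print(output["choices"][0]["text"])
```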
@@ -269,7 +272,8 @@ llm = Llama(model_path="./models/7B/llama-model.gguf", n_ctx=2048)
 ```
 
 
-## Web Server
+## OpenAI Compatible Web Server
+---
 
 `llama-cpp-python` offers a web server which aims to act as a drop-in replacement for the OpenAI API.
 This allows you to use llama.cpp compatible models with any OpenAI compatible client (language libraries, services, etc).
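Note: since the heading now stresses OpenAI compatibility, here is one hedged way to exercise a locally running server with the official `openai` Python client (v1-style interface). The port, placeholder model name, and dummy API key are assumptions, not values taken from this commit.

```python
# Point the OpenAI client at a local llama-cpp-python server.
# Assumes the server is already running on localhost:8000 and that
# the openai>=1.0 client library is installed.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # local server, not api.openai.com
    api_key="sk-no-key-required",         # placeholder; adjust if your server enforces a key
)

response = client.chat.completions.create(
    model="local-model",  # placeholder name; the server uses whichever model it loaded
    messages=[{"role": "user", "content": "Name the planets in the solar system."}],
)
print(response.choices[0].message.content)
```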
@@ -302,13 +306,14 @@ python3 -m llama_cpp.server --model models/7B/llama-model.gguf --chat_format cha
 That will format the prompt according to how model expects it. You can find the prompt format in the model card.
 For possible options, see [llama_cpp/llama_chat_format.py](llama_cpp/llama_chat_format.py) and look for lines starting with "@register_chat_format".
 
-### Web Server Examples
+### Web Server Features
 
 - [Local Copilot replacement](https://llama-cpp-python.readthedocs.io/en/latest/server/#code-completion)
 - [Function Calling support](https://llama-cpp-python.readthedocs.io/en/latest/server/#function-calling)
 - [Vision API support](https://llama-cpp-python.readthedocs.io/en/latest/server/#multimodal-models)
 
 ## Docker image
+---
 
 A Docker image is available on [GHCR](https://ghcr.io/abetlen/llama-cpp-python). To run the server:
 
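Note: the `--chat_format` option discussed in this hunk has a counterpart in the high-level API. A rough sketch, assuming the `chat_format` constructor argument and `create_chat_completion` behave as in the linked API reference; the model path is a placeholder and "chatml" is just one of the registered formats.

```python
# Chat completion with an explicit chat format (model path is a placeholder).
from llama_cpp import Llama

llm = Llama(
    model_path="./models/7B/llama-model.gguf",
    chat_format="chatml",  # pick the format the model card calls for
)
result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Name the planets in the solar system."},
    ],
)
print(result["choices"][0]["message"]["content"])
```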
@@ -318,6 +323,7 @@ docker run --rm -it -p 8000:8000 -v /path/to/models:/models -e MODEL=/models/lla
 [Docker on termux (requires root)](https://gist.github.com/FreddieOliveira/efe850df7ff3951cb62d74bd770dce27) is currently the only known way to run this on phones, see [termux support issue](https://github.com/abetlen/llama-cpp-python/issues/389)
 
 ## Low-level API
+---
 
 [API Reference](https://llama-cpp-python.readthedocs.io/en/latest/api-reference/#low-level-api)
 
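Note: the raw bindings under this heading track `llama.h` and their signatures change between releases, so rather than guess at them, the sketch below does a tokenization round trip through the high-level wrapper instead of the low-level API. It assumes `Llama.tokenize`/`Llama.detokenize` accept and return `bytes`/token-id lists, and that `vocab_only=True` loads only the tokenizer.

```python
# Tokenize/detokenize via the high-level wrapper, as a stand-in for the
# README's low-level tokenization example (model path is a placeholder).
from llama_cpp import Llama

llm = Llama(model_path="./models/7B/llama-model.gguf", vocab_only=True)

tokens = llm.tokenize(b"Q: Name the planets in the solar system? A: ")
print("token ids:", tokens)
print("round trip:", llm.detokenize(tokens).decode("utf-8", errors="replace"))
```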
@@ -344,12 +350,14 @@ Below is a short example demonstrating how to use the low-level API to tokenize
 Check out the [examples folder](examples/low_level_api) for more examples of using the low-level API.
 
 
-# Documentation
+## Documentation
+---
 
 Documentation is available via [https://llama-cpp-python.readthedocs.io/](https://llama-cpp-python.readthedocs.io/).
 If you find any issues with the documentation, please open an issue or submit a PR.
 
-# Development
+## Development
+---
 
 This package is under active development and I welcome any contributions.
 
@@ -375,7 +383,21 @@ pip install -e .[all]
 make clean
 ```
 
-# How does this compare to other Python bindings of `llama.cpp`?
+## FAQ
+---
+
+### Are there pre-built binaries / binary wheels available?
+
+The recommended installation method is to install from source as described above.
+The reason for this is that `llama.cpp` is built with compiler optimizations that are specific to your system.
+Using pre-built binaries would require disabling these optimizations or supporting a large number of pre-built binaries for each platform.
+
+That being said there are some pre-built binaries available through the Releases as well as some community provided wheels.
+
+In the future, I would like to provide pre-built binaries and wheels for common platforms and I'm happy to accept any useful contributions in this area.
+This is currently being tracked in #741
+
+### How does this compare to other Python bindings of `llama.cpp`?
 
 I originally wrote this package for my own use with two goals in mind:
 
@@ -384,6 +406,7 @@ I originally wrote this package for my own use with two goals in mind:
 
 Any contributions and changes to this package will be made with these goals in mind.
 
-# License
+## License
+---
 
 This project is licensed under the terms of the MIT license.
