
Commit ccbec25: update readme
1 parent: a1a7474

File tree: 1 file changed, +5 -35 lines


README.md

Lines changed: 5 additions & 35 deletions
@@ -16,9 +16,6 @@ Inference of Meta's LLaMA model (and others) in pure C/C++.
 2.3 [Infilling](#infilling)
 3. [Android](#importing-in-android)
 
-> [!NOTE]
-> Now with support for Llama 3, Phi-3, and flash attention
-
 ## Quick Start
 
 Access this library via Maven:
@@ -27,18 +24,7 @@ Access this library via Maven:
 <dependency>
     <groupId>de.kherud</groupId>
     <artifactId>llama</artifactId>
-    <version>3.4.1</version>
-</dependency>
-```
-
-Bu default the default library artifact is built only with CPU inference support. To enable CUDA, use a `cuda12-linux-x86-64` maven classifier:
-
-```xml
-<dependency>
-    <groupId>de.kherud</groupId>
-    <artifactId>llama</artifactId>
-    <version>3.4.1</version>
-    <classifier>cuda12-linux-x86-64</classifier>
+    <version>4.0.0</version>
 </dependency>
 ```

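For orientation, the Quick Start that this dependency feeds into looks roughly like the following. This is a minimal sketch only: `LlamaModel`, `ModelParameters`, and `InferenceParameters` appear in the library's own example section (see the `LlamaModel.setLogger` hunk further down), but the exact 4.0.0 method names used here (`setModelFilePath`, `complete`) are assumptions and may differ.

```java
import de.kherud.llama.InferenceParameters;
import de.kherud.llama.LlamaModel;
import de.kherud.llama.ModelParameters;

public class QuickStart {
    public static void main(String[] args) {
        // Assumed builder-style setup; the model path is a placeholder.
        ModelParameters modelParams = new ModelParameters()
                .setModelFilePath("/path/to/model.gguf");
        // LlamaModel is assumed to be AutoCloseable, releasing native memory on close.
        try (LlamaModel model = new LlamaModel(modelParams)) {
            InferenceParameters inferParams = new InferenceParameters("Tell me a joke.");
            // Blocking completion; the library also documents streaming generation.
            System.out.println(model.complete(inferParams));
        }
    }
}
```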
@@ -50,11 +36,7 @@ We support CPU inference for the following platforms out of the box:
 
 - Linux x86-64, aarch64
 - MacOS x86-64, aarch64 (M-series)
-- Windows x86-64, x64, arm (32 bit)
-
-For GPU inference, we support:
-
-- Linux x86-64 with CUDA 12.1+
+- Windows x86-64, x64
 
 If any of these match your platform, you can include the Maven dependency and get started.
 
@@ -88,13 +70,9 @@ All compiled libraries will be put in a resources directory matching your platform
 
 #### Library Location
 
-This project has to load three shared libraries:
+This project has to load a single shared library `jllama`.
 
-- ggml
-- llama
-- jllama
-
-Note, that the file names vary between operating systems, e.g., `ggml.dll` on Windows, `libggml.so` on Linux, and `libggml.dylib` on macOS.
+Note, that the file name varies between operating systems, e.g., `jllama.dll` on Windows, `jllama.so` on Linux, and `jllama.dylib` on macOS.
 
 The application will search in the following order in the following locations:
 
@@ -105,14 +83,6 @@ The application will search in the following order in the following locations:
 - From the **JAR**: If any of the libraries weren't found yet, the application will try to use a prebuilt shared library.
   This of course only works for the [supported platforms](#no-setup-required).
 
-Not all libraries have to be in the same location.
-For example, if you already have a llama.cpp and ggml version you can install them as a system library and rely on the jllama library from the JAR.
-This way, you don't have to compile anything.
-
-#### CUDA
-
-On Linux x86-64 with CUDA 12.1+, the library assumes that your CUDA libraries are findable in `java.library.path`. If you have CUDA installed in a non-standard location, then point the `java.library.path` to the directory containing the `libcudart.so.12` library.
-
 ## Documentation
 
 ### Example
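The Library Location hunks above reduce to the JVM's standard native-library mechanism: `System.loadLibrary("jllama")` resolves the bare name to the platform-specific file names listed in the diff. A minimal sketch of overriding the search path, assuming a hypothetical directory `/opt/jllama`:

```java
public class LibPathCheck {
    public static void main(String[] args) {
        // Run with: java -Djava.library.path=/opt/jllama LibPathCheck
        // (the directory is a placeholder for wherever jllama was built).
        System.out.println(System.getProperty("java.library.path"));

        // loadLibrary maps the bare name "jllama" to a platform-specific
        // file name, per the naming rule described in the diff above;
        // it throws UnsatisfiedLinkError if the library cannot be found.
        System.loadLibrary("jllama");
        System.out.println("jllama loaded");
    }
}
```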
@@ -234,7 +204,7 @@ LlamaModel.setLogger(null, (level, message) -> {});
 ## Importing in Android
 
 You can use this library in Android project.
-1. Add java-llama.cpp as a submodule in your android `app` project directory
+1. Add java-llama.cpp as a submodule in your an droid `app` project directory
 ```shell
 git submodule add https://github.com/kherud/java-llama.cpp
 ```
