crabml

crabml is an ongoing experiment that aims to reimplement GGML using Rust.

Currently it can run inference on a 3B Q8_0-quantized Llama model, at a dog-slow speed.
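
For readers unfamiliar with Q8_0: in GGML it is a block format in which every 32 weights share one scale factor and each weight is stored as a signed 8-bit integer. The sketch below illustrates the idea only and is not taken from crabml's source. The names `BlockQ8`, `quantize_q8`, and `dot_q8` are hypothetical, and the scale is kept as an `f32` for simplicity, whereas GGML stores it as a 16-bit float.

```rust
// Illustrative sketch of Q8_0-style block quantization.
// Assumption: the scale is f32 here; GGML uses a 16-bit float instead.
const BLOCK_SIZE: usize = 32;

#[derive(Clone, Debug)]
struct BlockQ8 {
    scale: f32,               // per-block scale factor
    quants: [i8; BLOCK_SIZE], // quantized values in [-127, 127]
}

// Quantize a slice whose length is a multiple of 32 into blocks.
fn quantize_q8(values: &[f32]) -> Vec<BlockQ8> {
    assert!(values.len() % BLOCK_SIZE == 0, "length must be a multiple of 32");
    values
        .chunks_exact(BLOCK_SIZE)
        .map(|chunk| {
            // The largest magnitude in the block maps to +/-127.
            let max_abs = chunk.iter().fold(0.0f32, |m, v| m.max(v.abs()));
            let scale = max_abs / 127.0;
            let inv = if scale != 0.0 { 1.0 / scale } else { 0.0 };
            let mut quants = [0i8; BLOCK_SIZE];
            for (q, v) in quants.iter_mut().zip(chunk) {
                *q = (v * inv).round() as i8;
            }
            BlockQ8 { scale, quants }
        })
        .collect()
}

// Dot product of two quantized vectors: integer multiply-accumulate
// inside each block, then one floating-point rescale per block.
fn dot_q8(a: &[BlockQ8], b: &[BlockQ8]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter()
        .zip(b)
        .map(|(x, y)| {
            let isum: i32 = x
                .quants
                .iter()
                .zip(&y.quants)
                .map(|(&p, &q)| p as i32 * q as i32)
                .sum();
            x.scale * y.scale * isum as f32
        })
        .sum()
}
```

The inner loop works purely on 8-bit integers, which is the part that SIMD instructions can speed up; the floating-point work is limited to one multiplication per block.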

Its design goals are:

focus on inference only.
limit tensor operators to the bare minimum required for LLM inference.
fast enough inference on cheap hardware.
mmap() from day one.
prioritize SIMD ahead of GPU.

Build

RUSTFLAGS="-C target-feature=+neon" cargo build --release
./target/release/crabml-cli -m ./testdata/open-llama-3b-q8_0.gguf "captain america" --steps 100 -t 0.8 -p 1.0

Xuanwo/crabml