Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Xuanwo/crabml

 
 

Repository files navigation

crabml

crabml is an ongoing experiment that aims to reimplement GGML using Rust.

Currently it can inference a 3B Q8_0 quantized Llama model at a dog slow speed.

Its design goals are:

  • focus on inference only.
  • limit tensor operators to the bare minimum required for LLM inference.
  • fast enough inferencing on cheap hardwares.
  • mmap() from day one.
  • prioritize SIMD ahead of GPU.

Build

RUSTFLAGS="-C target-feature=+neon" cargo build --release
./target/release/crabml-cli -m ./testdata/open-llama-3b-q8_0.gguf "captain america" --steps 100 -t 0.8 -p 1.0

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Rust 95.7%
  • WGSL 4.3%