A minimalist, high-performance implementation of RWKV (Receptance Weighted Key Value) models using Candle, a lightweight ML framework for Rust.
We support the latest and greatest from the RWKV family:
- ✅ RWKV7 (Goose)
- ✅ RWKV6 (Finch)
- ✅ RWKV5 (Eagle)
Ready to run? The commands below will get you started immediately. Run inference directly from the command line:
```bash
# Run RWKV7 (Goose)
cargo run --release --example rwkv -- --which "v7-0b1" --prompt "User: why is the sky blue?\n\nAssistant: "

# Run RWKV6 (Finch)
cargo run --release --example rwkv -- --which "v6-1b6" --prompt "User: Hello, how are you?\n\nAssistant: "
```

Running on a laptop? Use quantization to save memory.
```bash
# Run Quantized RWKV7 (Goose)
cargo run --release --example rwkv -- --quantized --which "v7-0b1" --prompt "User: Tell me a joke.\n\nAssistant: "
```

If you prefer managing your own model files (e.g. a .pth checkpoint downloaded from HuggingFace), we provide tools to convert and run them.
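If you'd rather script the download than fetch the checkpoint by hand, a sketch along these lines could work using the `hf-hub` crate; the repo id `BlinkDL/rwkv-6-world` and the `anyhow` error handling are assumptions of this sketch, not something this repo pins:

```rust
// Hypothetical sketch: fetch a .pth checkpoint from the HuggingFace Hub
// with the `hf-hub` crate. The repo id and filename are illustrative.
use hf_hub::api::sync::Api;

fn main() -> anyhow::Result<()> {
    let api = Api::new()?;
    let repo = api.model("BlinkDL/rwkv-6-world".to_string());
    // Downloads into the local HF cache (if not already present)
    // and returns the on-disk path to the checkpoint.
    let path = repo.get("RWKV-x060-World-1B6-v2.1-20240328-ctx4096.pth")?;
    println!("checkpoint at {}", path.display());
    Ok(())
}
```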
First, convert PyTorch weights (.pth) to SafeTensors for efficient loading in Rust.
```bash
# Convert Model Weights
cargo run --release --example convert -- --input ./RWKV-x060-World-1B6-v2.1-20240328-ctx4096.pth

# Convert State Files
cargo run --release --example convert -- --input ./rwkv-x060-chn_single_round_qa-1B6-20240516-ctx2048.pth
```

Then point the `rwkv` example at the converted files:

```bash
# Run with local converted files
cargo run --release --example rwkv -- \
    --which "v6-1b6" \
    --weight-files ./RWKV-x060-World-1B6-v2.1-20240328-ctx4096.safetensors \
    --state-file ./rwkv-x060-chn_single_round_qa-1B6-20240516-ctx2048.safetensors \
    --prompt "Hello world!"
```
Alternatively, convert .pth files to the standardized GGUF format and run them quantized.

```bash
# Quantize .pth to .gguf
cargo run --release --example quantize -- --input ./RWKV-x060-World-1B6-v2.1-20240328-ctx4096.pth

# Run with local GGUF file
cargo run --release --example rwkv -- \
    --quantized \
    --which "v6-1b6" \
    --weight-files ./RWKV-x060-World-1B6-v2.1-20240328-ctx4096-q4k.gguf \
    --prompt "User: Hello!\n\nAssistant: "
```
Contributions are more than welcome! Feel free to open issues or submit PRs.

Powered by [candle](https://github.com/huggingface/candle)