- Rust toolchain (rustup, cargo) with edition 2024 support
- make
- Optional for WASM builds: wasm-bindgen-cli

git clone <your-fork-or-remote> oxidize
cd oxidize
make build

cargo run -p oxidize-cli -- --prompt "hello"

cargo run -p oxidize-cli --release -- \
--model /path/to/model.gguf \
--prompt "Your prompt here" \
--max-tokens 512 \
--temperature 0.7

Example with your model:
cargo run -p oxidize-cli --release -- \
--model "/run/media/dih/8CEDA5F938E73A48/AI/models/HauhauCS/Qwen3.5-4B-Uncensored-HauhauCS-Aggressive/Qwen3.5-4B-Uncensored-HauhauCS-Aggressive-Q4_K_M.gguf" \
--prompt "Write a 3 page essay on why llama cpp is better than LM Studio" \
--max-tokens 1500 \
--temperature 0.7

Generation parameters:
- --max-tokens N - Maximum tokens to generate (default: 512)
- --temperature T - Sampling temperature, 0.0-2.0 (default: 0.8)
- --top-p P - Nucleus sampling threshold (optional)
- --top-k K - Top-k sampling limit (optional)

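To make the sampling flags more concrete, here is a minimal sketch of how temperature, top-k, and top-p are commonly combined when picking the next token. It is not taken from the oxidize source: the function names `filter_distribution` and `sample`, the order of the filters, and the treatment of temperature 0.0 as greedy decoding are all assumptions made for illustration.

```rust
/// Hypothetical helper: turn raw logits into a filtered, renormalized
/// distribution of (token_index, probability), sorted by descending probability.
fn filter_distribution(
    logits: &[f32],
    temperature: f32,
    top_k: Option<usize>,
    top_p: Option<f32>,
) -> Vec<(usize, f32)> {
    // Assumption: a temperature of 0.0 means greedy decoding (always pick the argmax).
    if temperature <= 0.0 {
        let best = logits
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.total_cmp(b.1))
            .map(|(i, _)| i);
        return best.map(|i| vec![(i, 1.0)]).unwrap_or_default();
    }

    // Softmax over temperature-scaled logits, shifted by the max for numerical stability.
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let mut probs: Vec<(usize, f32)> = logits
        .iter()
        .enumerate()
        .map(|(i, &l)| (i, ((l - max) / temperature).exp()))
        .collect();
    let sum: f32 = probs.iter().map(|p| p.1).sum();
    probs.iter_mut().for_each(|p| p.1 /= sum);
    probs.sort_by(|a, b| b.1.total_cmp(&a.1));

    // Top-k: keep only the k most likely tokens (at least one).
    if let Some(k) = top_k {
        probs.truncate(k.max(1));
    }

    // Top-p: keep the smallest prefix whose cumulative probability reaches p.
    if let Some(p) = top_p {
        let mut cumulative = 0.0;
        let mut keep = probs.len();
        for (n, &(_, prob)) in probs.iter().enumerate() {
            cumulative += prob;
            if cumulative >= p {
                keep = n + 1;
                break;
            }
        }
        probs.truncate(keep);
    }

    // Renormalize so the surviving probabilities sum to 1.
    let kept: f32 = probs.iter().map(|p| p.1).sum();
    probs.iter_mut().for_each(|p| p.1 /= kept);
    probs
}

/// Hypothetical helper: pick a token given a uniform random draw `u` in [0, 1).
fn sample(dist: &[(usize, f32)], u: f32) -> Option<usize> {
    let mut cumulative = 0.0;
    for &(token, prob) in dist {
        cumulative += prob;
        if u < cumulative {
            return Some(token);
        }
    }
    // Guard against rounding error leaving `u` just above the final cumulative sum.
    dist.last().map(|&(token, _)| token)
}

fn main() {
    let logits = [2.0, 1.0, 0.5, -1.0];
    let dist = filter_distribution(&logits, 0.7, Some(3), Some(0.9));
    println!("{dist:?}");
    println!("{:?}", sample(&dist, 0.42));
}
```

Lower temperatures sharpen the distribution toward the most likely token, while top-k and top-p trim the low-probability tail before the draw. The random draw is passed in as `u` so the sketch needs no external crates.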
Output includes TPS tracking:
generation stats: tokens=200 speed=4540.92 tok/s
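
If you want to collect these numbers in a script or benchmark harness, the stats line can be parsed with a few lines of Rust. This is only a sketch: the field layout is inferred from the single sample line above, and `parse_generation_stats` is a hypothetical helper, not part of oxidize.

```rust
/// Hypothetical helper: parse a line such as
/// `generation stats: tokens=200 speed=4540.92 tok/s`
/// into (token count, tokens per second). Returns None if the prefix
/// or either field is missing or malformed.
fn parse_generation_stats(line: &str) -> Option<(u64, f64)> {
    let rest = line.trim().strip_prefix("generation stats:")?;
    let mut tokens = None;
    let mut speed = None;
    for field in rest.split_whitespace() {
        if let Some(v) = field.strip_prefix("tokens=") {
            tokens = v.parse::<u64>().ok();
        } else if let Some(v) = field.strip_prefix("speed=") {
            speed = v.parse::<f64>().ok();
        }
    }
    Some((tokens?, speed?))
}

fn main() {
    let line = "generation stats: tokens=200 speed=4540.92 tok/s";
    if let Some((tokens, speed)) = parse_generation_stats(line) {
        // Elapsed generation time follows from tokens / speed.
        println!("{tokens} tokens at {speed} tok/s ({:.3} s)", tokens as f64 / speed);
    }
}
```
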
cargo run -p oxidize-cli -- --chat

cargo run -p oxidize-server -- --host 127.0.0.1 --port 8080

Health check:
curl http://127.0.0.1:8080/healthz

cargo run -p oxidize-quantize -- \
--input /path/to/input.bin \
--output /path/to/output.bin \
--source F32 \
--target F16

make test
make lint

- make fmt - Check Rust formatting
- make lint - Run clippy with warnings denied
- make audit - Run cargo-deny license/security audit
- make test - Run workspace tests
- make build - Build release binaries for all targets
- make wasm - Build oxidize-core with wasm-bindgen output
- make check - Run fmt + lint + test
- make ci - Run check + build