From ef551af6c15af3fb7315edcbaf3f9dacf594ec5c Mon Sep 17 00:00:00 2001
From: Wen Shi
Date: Fri, 28 Apr 2023 23:58:30 +0800
Subject: [PATCH] Correct the type parameter given to quantize

Passing `q4_0` as the type causes this error:
`llama_model_quantize: failed to quantize: invalid output file type 0`.
Per the doc, the type should be 2 if we need q4_0:

type = 2 - q4_0
type = 3 - q4_1
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 2a20746c63b18..5c85ecbcc5a18 100644
--- a/README.md
+++ b/README.md
@@ -271,7 +271,7 @@ python3 -m pip install -r requirements.txt
 python3 convert.py models/7B/
 
 # quantize the model to 4-bits (using q4_0 method)
-./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin q4_0
+./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2
 
 # run the inference
 ./main -m ./models/7B/ggml-model-q4_0.bin -n 128