
Huge perplexity scores from CLBLAST on an Android phone GPU? #2133

Closed

@hchenphd

Description

I tried to run the perplexity tool on a Xiaomi phone with CLBLAST on the GPU. The same model tests fine on a MacBook, but on the phone the perplexity scores are really huge, like [1]2717.5986,[2]3794.2774,

I am confused. Can anyone give some hints about this issue?

./bin/perplexity -m ../../storage/downloads/llm-models/ggml-model-q4_0.bin -f ../../storage/downloads/wikitext-2-raw/wiki.test.raw

main: build = 787 (7f0e9a7)
main: seed = 1688739070
ggml_opencl: selecting platform: 'QUALCOMM Snapdragon(TM)'
ggml_opencl: selecting device: 'QUALCOMM Adreno(TM)'
ggml_opencl: device FP16 support: true
llama.cpp: loading model from ../../storage/downloads/llm-models/ggml-model-q4_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.08 MB
llama_model_load_internal: using OpenCL for GPU acceleration
llama_model_load_internal: mem required = 5407.72 MB (+ 1026.00 MB per state)
llama_model_load_internal: offloading 0 repeating layers to GPU
llama_model_load_internal: offloaded 0/35 layers to GPU
llama_model_load_internal: total VRAM used: 0 MB
llama_new_context_with_model: kv self size = 256.00 MB

system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |
perplexity: calculating perplexity over 160 chunks, batch_size=512
perplexity: 78.23 seconds per pass - ETA 3 hours 28 minutes
[1]2717.5986,[2]3794.2774,
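For scale on how far off these numbers are: as I understand it, llama.cpp's perplexity example prints a running exp(mean negative log-likelihood per token), so values in the thousands mean the model is scoring its target tokens close to a uniform guess over the 32000-token vocabulary, whereas a correct 7B Q4_0 run on wikitext-2 usually lands around 6. A minimal sketch of that arithmetic (the token counts and NLL values below are illustrative, not taken from this log):

import math

# The perplexity tool accumulates a negative log-likelihood (in nats) per
# scored token and prints a running exp(mean NLL) after every chunk.
def running_ppl(nll_sum: float, n_tokens: int) -> float:
    return math.exp(nll_sum / n_tokens)

# ~1.8 nats/token corresponds to the ~6 ppl expected for 7B Q4_0 on
# wikitext-2 (illustrative numbers only).
print(running_ppl(1.8 * 256, 256))   # ~6.0

# ~7.9 nats/token reproduces a score like 2717, close to the ~10.37 nats of
# a uniform guess over 32000 tokens, i.e. the logits are essentially noise.
print(running_ppl(7.9 * 256, 256))   # ~2700
print(math.log(32000))               # ~10.37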
