
Huge perplexity scores from CLBLAST on an Android phone GPU? #2133

Closed

@hchenphd

Description

I tried to run the perplexity tool on a Xiaomi phone with CLBLAST on the GPU. The same model tests fine on a MacBook, but on the phone the perplexity scores are really huge, like [1]2717.5986,[2]3794.2774,

I am confused. Can anyone give some hints about this issue?

./bin/perplexity -m ../../storage/downloads/llm-models/ggml-model-q4_0.bin -f ../../storage/downloads/wikitext-2-raw/wiki.test.raw

main: build = 787 (7f0e9a7)
main: seed = 1688739070
ggml_opencl: selecting platform: 'QUALCOMM Snapdragon(TM)'
ggml_opencl: selecting device: 'QUALCOMM Adreno(TM)'
ggml_opencl: device FP16 support: true
llama.cpp: loading model from ../../storage/downloads/llm-models/ggml-model-q4_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.08 MB
llama_model_load_internal: using OpenCL for GPU acceleration
llama_model_load_internal: mem required = 5407.72 MB (+ 1026.00 MB per state)
llama_model_load_internal: offloading 0 repeating layers to GPU
llama_model_load_internal: offloaded 0/35 layers to GPU
llama_model_load_internal: total VRAM used: 0 MB
llama_new_context_with_model: kv self size = 256.00 MB

system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |
perplexity: calculating perplexity over 160 chunks, batch_size=512
perplexity: 78.23 seconds per pass - ETA 3 hours 28 minutes
[1]2717.5986,[2]3794.2774,
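For scale on how far off these numbers are: as I understand it, llama.cpp's perplexity example prints a running exp(mean negative log-likelihood per token), so values in the thousands mean the model is scoring its target tokens close to a uniform guess over the 32000-token vocabulary, whereas a correct 7B Q4_0 run on wikitext-2 usually lands around 6. A minimal sketch of that arithmetic (the token counts and NLL values below are illustrative, not taken from this log):

import math

# The perplexity tool accumulates a negative log-likelihood (in nats) per
# scored token and prints a running exp(mean NLL) after every chunk.
def running_ppl(nll_sum: float, n_tokens: int) -> float:
    return math.exp(nll_sum / n_tokens)

# ~1.8 nats/token corresponds to the ~6 ppl expected for 7B Q4_0 on
# wikitext-2 (illustrative numbers only).
print(running_ppl(1.8 * 256, 256))   # ~6.0

# ~7.9 nats/token reproduces a score like 2717, close to the ~10.37 nats of
# a uniform guess over 32000 tokens, i.e. the logits are essentially noise.
print(running_ppl(7.9 * 256, 256))   # ~2700
print(math.log(32000))               # ~10.37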
