Java tests failed when CUDA enabled on version 3.0.0 #54
Comments
Here is the log file:
Damn, I didn't test thoroughly enough with CUDA, but I can reproduce the problem. Thanks for reporting! It seems to be related to src/main/cpp/server.hpp, line 2266 (commit 6d500b5).
It turns out this is a bug in llama.cpp after all, and I've created an issue there (see ggml-org/llama.cpp#6672). It didn't produce a crash for you because the … The problem only seems to occur with models that don't support infilling, which unfortunately is the case for the model used in the unit tests. Everything works correctly with models that do support infilling (e.g. codellama).
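For context, here is a minimal sketch (not the java-llama.cpp API) of what "supporting infilling" means: fill-in-the-middle models such as codellama define special prefix/suffix/middle tokens, and a server can guard against models that lack them instead of crashing. The class, method names, and token ids below are illustrative assumptions, not code from this repository.

```java
// Illustrative sketch only: how a CodeLlama-style fill-in-the-middle
// (infill) prompt is assembled, and the kind of guard that avoids a
// crash when a model does not define the special FIM tokens.
public class InfillSketch {
    // Hypothetical sentinel: a model without FIM support would map the
    // special tokens to an invalid id such as -1.
    static final int MISSING_TOKEN = -1;

    static boolean supportsInfill(int prefixTokenId, int suffixTokenId, int middleTokenId) {
        // Refuse infill requests up front instead of using undefined tokens.
        return prefixTokenId != MISSING_TOKEN
            && suffixTokenId != MISSING_TOKEN
            && middleTokenId != MISSING_TOKEN;
    }

    static String buildInfillPrompt(String prefix, String suffix) {
        // CodeLlama-style infill layout: <PRE> {prefix} <SUF>{suffix} <MID>
        // (token spellings here are illustrative).
        return "<PRE> " + prefix + " <SUF>" + suffix + " <MID>";
    }

    public static void main(String[] args) {
        // Placeholder ids for a model that does define FIM tokens.
        System.out.println(supportsInfill(32007, 32008, 32009));
        // A model lacking FIM tokens should be rejected, not crash.
        System.out.println(supportsInfill(MISSING_TOKEN, MISSING_TOKEN, MISSING_TOKEN));
        System.out.println(buildInfillPrompt("def add(a, b):", "    return result"));
    }
}
```

With such a guard in place, an infill request against a non-infill model would return an error response rather than segfault, which matches the behavior difference described above.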
I changed the model used for testing to codellama, so there shouldn't be a segmentation fault anymore. However, I'm leaving this issue open until the underlying bug is fixed in llama.cpp.
I think I've diagnosed the issue and pointed to the tag that fixed it in the related thread: ggml-org/llama.cpp#6672
Hello!
I really appreciate that you have upgraded this project!
However, there are still two tests that cannot pass: testGenerateInfill and testCompleteInfillCustom. The outputs look something like this:

I built with the command
cmake .. -DBUILD_SHARED_LIBS=ON -DLLAMA_CUDA=ON -DLLAMA_CURL=ON

Also, I tested vanilla llama.cpp at tag b2619, with the same build args above and the same inference args (shown below), and it ran without crashing:
and
Anyway, the other Java tests passed.
Thanks!