Hello,
I'm trying to use rwkv.cpp from two different threads. To do this, I load the model and then use two context clones (created via rwkv_clone_context). Everything works fine when only one thread runs rwkv_eval at a time, but when both threads run it simultaneously, I get this error:
GGML_ASSERT: /root/rwkv.cpp/ggml/src/ggml-cuda.cu:409: ptr == (void *) (pool_addr + pool_used)
GGML_ASSERT: /root/rwkv.cpp/ggml/src/ggml-cuda.cu:409: ptr == (void *) (pool_addr + pool_used)
It seems that alloc/free are called "out of order" for the two contexts. Any idea how to solve this?
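If that guess is right, one possible stopgap is to serialize the eval calls behind a single mutex, so that only one thread at a time can trigger alloc/free. Below is a minimal sketch of that idea, not a tested fix. Note that eval_stub, locked_eval, worker and run_two_threads are hypothetical names made up for illustration; eval_stub only stands in for the real rwkv_eval call on a cloned context and is not part of rwkv.cpp.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Bookkeeping used only to show that calls never overlap.
static int g_in_flight = 0;      // evals currently running
static int g_max_in_flight = 0;  // highest overlap observed
static int g_total = 0;          // total evals performed

// HYPOTHETICAL stand-in for rwkv_eval on one cloned context.
// Replace the body with the real call; the arguments are placeholders.
static void eval_stub(int /*context_id*/, int /*token*/) {
    ++g_in_flight;
    assert(g_in_flight == 1);  // no other eval may be running right now
    if (g_in_flight > g_max_in_flight) g_max_in_flight = g_in_flight;
    ++g_total;
    --g_in_flight;
}

// One process-wide lock shared by every context clone.
static std::mutex g_eval_mutex;

// Wrapper that every thread calls instead of calling eval directly.
static void locked_eval(int context_id, int token) {
    std::lock_guard<std::mutex> lock(g_eval_mutex);
    eval_stub(context_id, token);
}

// Each thread evaluates n_tokens tokens on its own context clone.
static void worker(int context_id, int n_tokens) {
    for (int t = 0; t < n_tokens; ++t) {
        locked_eval(context_id, t);
    }
}

// Runs two workers concurrently and returns the highest overlap seen.
int run_two_threads(int n_tokens) {
    std::thread a(worker, 0, n_tokens);
    std::thread b(worker, 1, n_tokens);
    a.join();
    b.join();
    return g_max_in_flight;
}
```

This only hides the problem: with the lock in place the two threads no longer overlap during eval, so there is no real parallelism between the two contexts.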
Thanks!