Improve Llama.eval efficiency #1476

thoughtp0lice · 2024-05-23T02:01:42Z

In llama_cpp/llama.py, the eval function converts the return value of self._ctx.get_logits(), which is a CtypesArray, to a list and then copies it into self.scores. This PR converts the CtypesArray directly to a numpy array instead, which speeds up both the conversion and the copy. The speed-up is especially noticeable on smaller models with faster inference times.
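To illustrate the difference, here is a minimal sketch of the two copy paths. It is not the PR's actual diff. A locally allocated ctypes buffer stands in for the result of self._ctx.get_logits(), and the names n_tokens, n_vocab, scores_slow, and scores_fast are hypothetical.

```python
import ctypes

import numpy as np

# Hypothetical sizes; in the real code these come from the batch and the model.
n_tokens, n_vocab = 4, 8
n_values = n_tokens * n_vocab

# Stand-in for the C float buffer that get_logits() would point to (assumption).
buf = (ctypes.c_float * n_values)(*range(n_values))
ptr = ctypes.cast(buf, ctypes.POINTER(ctypes.c_float))

scores_slow = np.zeros((n_tokens, n_vocab), dtype=np.single)
scores_fast = np.zeros((n_tokens, n_vocab), dtype=np.single)

# List path: slicing the ctypes pointer builds a Python list of floats,
# which numpy then has to convert element by element.
scores_slow.reshape(-1)[:] = ptr[:n_values]

# Direct path: wrap the same memory as a numpy array (no intermediate list),
# then copy it into the scores buffer in one vectorized assignment.
logits = np.ctypeslib.as_array(ptr, shape=(n_values,))
scores_fast.reshape(-1)[:] = logits
```

Both paths fill the scores buffer with the same values. The direct path skips creating one Python float object per logit, which is where the saving would come from when the vocabulary is large.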

abetlen · 2024-05-24T05:42:57Z

@thoughtp0lice thank you, that's perfect!

improve Llama.eval efficiency

ef091dc

thoughtp0lice changed the title ~~improve Llama.eval efficiency~~ Improve Llama.eval efficiency May 23, 2024

Merge branch 'main' into main

fa3da60

abetlen merged commit 5cae104 into abetlen:main May 24, 2024
16 checks passed
